Method and nucleic acids for the improved treatment of breast cell proliferative disorders

Field of the Invention 

In American women, breast cancer is the most frequently diagnosed cancer and the second 
leading cause of cancer death. In women aged 40-55, breast cancer is the leading cause of
death (Greenlee et al., 2000). In 2002, there were 204,000 new cases of breast cancer in the
US (data from the American Society of Clinical Oncology) and a comparable number in 
Europe. 

Breast cancer is defined as the uncontrolled proliferation of cells within breast tissues.
Breasts are composed of 15 to 20 lobes joined together by ducts. Cancer arises most commonly
in the ducts but is also found in the lobes; the rarest type is termed inflammatory breast
cancer. It will be appreciated by those skilled in the art that there exists a continuing
need to improve methods of early detection, classification and treatment of breast cancers.
In contrast to the detection of some other common cancers, such as cervical and dermal
cancers, there are inherent difficulties in classifying and detecting breast cancers.

Due to current screening programs and the accessibility of this cancer to self-examination, 
breast cancer is diagnosed comparatively early: in about 93% of all newly diagnosed cases, 
the cancer has not yet metastasized, and in 65% of cases, even the lymph nodes are not yet 
affected. 

The first step of any treatment is the assessment of the patient's condition relative to
defined classifications of the disease. However, the value of such a system is inherently
dependent upon the quality of the classification. Breast cancers are staged according to their
size, location and occurrence of metastasis. Methods of treatment include the use of surgery,
radiation therapy, chemotherapy and endocrine therapy, which are also used as adjuvant
therapies to surgery.

Although the vast majority of early cancers are operable, i.e. the tumor can be completely
removed by surgery, about one third of the patients with lymph-node negative disease and
about 50-60% of patients with node-positive disease will develop metastases during
follow-up.

Based on this observation, systemic adjuvant treatment has been introduced for both node- 
positive and node-negative breast cancers. Systemic adjuvant therapy is administered after 
surgical removal of the tumor, and has been shown to reduce the risk of recurrence signifi- 
cantly (Early Breast Cancer Trialists' Collaborative Group, 1998). Several types of adjuvant 
treatment are available: endocrine treatment (for hormone receptor positive tumors), different 
chemotherapy regimens, and novel agents like Herceptin. 

The growth of the majority of breast cancers (approx. 70-80%) is dependent on the presence of
estrogen. Therefore, one important target for adjuvant therapy is the removal of estrogen (e.g.
by ovarian ablation) or the blocking of its actions on the tumor cells (e.g. Tamoxifen).
Endocrine treatment is thought to be effective only in tumors that express hormone receptors
(the estrogen receptor, ER, and/or the progesterone receptor, PR). Currently, the vast majority
of women with hormone receptor positive breast cancer receive some form of endocrine
treatment, independent of their nodal status. The most frequently used drug is Tamoxifen.
However, even among hormone receptor positive patients, not all benefit from endocrine
treatment. Adjuvant endocrine therapy reduces mortality rates by 22%, while response rates to
endocrine treatment in the advanced setting are 50 to 60% (Jordan et al., 2002; Jordan et al.,
1999; Osborne et al., 1998; European Breast Cancer Cooperative Group, 1998).

Since Tamoxifen has relatively few side effects, treatment may be justified even for patients
with low likelihood of benefit. However, these patients may require additional, more
aggressive adjuvant treatment. This is supported by the fact that, even in the earliest and
least aggressive tumours, such as node-negative, hormone receptor positive tumours, about 21%
of patients relapse within 10 years after initial diagnosis if they receive Tamoxifen
monotherapy as adjuvant treatment (Early Breast Cancer Trialists' Collaborative Group, Lancet,
1998).

Several cytotoxic regimens have been shown to be effective in reducing the risk of relapse in
breast cancer (Mansour et al., 1998). According to current treatment guidelines, most
node-positive patients receive adjuvant chemotherapy both in the US and Europe, since the risk
of relapse is considerable. Nevertheless, not all patients relapse, and there is a proportion
of patients who would never have relapsed even without chemotherapy, but who nevertheless
receive chemotherapy due to the currently used criteria. In hormone receptor positive
patients, chemotherapy is usually given before endocrine treatment, whereas hormone receptor
negative patients receive only chemotherapy.

The situation for node-negative patients is particularly complex. In the US, cytotoxic
chemotherapy is recommended for node-negative patients if the tumor is larger than 1 cm. In
Europe, chemotherapy is considered for node-negative cases if one or more risk factors are
present, such as tumor size larger than 2 cm, negative hormone receptor status, tumor grade
three, or age <35. In general, there is a tendency to select premenopausal women for
additional chemotherapy, whereas for postmenopausal women chemotherapy is often omitted.
Compared to endocrine treatment, in particular Tamoxifen, chemotherapy is highly toxic,
with short-term side effects such as nausea, vomiting and bone marrow depression, and
long-term effects such as cardiotoxicity and an increased risk of secondary cancers.

It is currently not clear which breast cancer patients should be selected for more aggressive
therapy, although clinicians agree that a subset of patients needs it. The difficulty of
selecting the right patients for chemotherapy, and the lack of suitable criteria, is also
reflected by a recent study based on data from the New Mexico Tumor Registry, which showed
that chemotherapy is used much less frequently than recommended (Du et al., 2003). This study
provides substantial evidence that there is a need for better selection of patients for
chemotherapy or other, more aggressive forms of breast cancer therapy.

The levels of observation that have been studied through the methodological developments of
recent years in molecular biology are the genes themselves, the transcription of these genes
into RNA, and the resulting proteins. The question of which gene is switched on at which point
in the course of the development of an individual, and how the activation and inhibition of
specific genes in specific cells and tissues are controlled, can be correlated with the degree
and character of the methylation of the genes or of the genome. In this respect, pathogenic
conditions may manifest themselves in a changed methylation pattern of individual genes or of
the genome.

DNA methylation plays a role, for example, in the regulation of transcription, in genetic
imprinting, and in tumorigenesis. Therefore, the identification of 5-methylcytosine as a
component of genetic information is of considerable interest. However, 5-methylcytosine
positions cannot be identified by sequencing, since 5-methylcytosine has the same base pairing
behaviour as cytosine. Moreover, the epigenetic information carried by 5-methylcytosine is
completely lost during PCR amplification.

The currently most frequently used method for analysing DNA for 5-methylcytosine is based 
upon the specific reaction of bisulphite with cytosine which, upon subsequent alkaline hy- 
drolysis, is converted to uracil which corresponds to thymine in its base pairing behaviour. 
However, 5-methylcytosine remains unmodified under these conditions. Consequently, the 
original DNA is converted in such a manner that methylcytosine, which originally could not 
be distinguished from cytosine by its hybridisation behaviour, can now be detected as the only 
remaining cytosine using "normal" molecular biological techniques, for example, by amplifi- 
cation and hybridisation or sequencing. All of these techniques are based on base pairing 
which can now be fully exploited. In terms of sensitivity, the prior art is defined by a method 
which encloses the DNA to be analysed in an agarose matrix, thus preventing the diffusion 
and renaturation of the DNA (bisulphite only reacts with single-stranded DNA), and which 
replaces all precipitation and purification steps with fast dialysis (Olek A, Oswald J, Walter J. 
A modified and improved method for bisulphite based cytosine methylation analysis. Nucleic 
Acids Res. 1996 Dec 15;24(24):5064-6). Using this method, it is possible to analyse individual
cells, which illustrates the potential of the method. However, currently only individual
regions of a length of up to approximately 3000 base pairs are analysed; a global analysis of
cells for thousands of possible methylation events is not possible. Moreover, this method
cannot reliably analyse very small fragments from small sample quantities. These are lost
through the matrix in spite of the diffusion protection.
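
The base-pairing consequence of the bisulphite reaction described above can be illustrated with a short in-silico sketch. It is not part of the cited method: the function names, the example sequence and the chosen methylated positions are hypothetical, and the model assumes complete conversion on a single strand, with uracil read as thymine after amplification.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulphite treatment plus amplification on one strand.

    Unmethylated cytosines are converted to uracil and read as T after
    amplification; cytosines at methylated positions remain C.
    Assumes 100% conversion efficiency (an idealisation).
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def called_methylated(original, converted):
    """Return positions where a C in the original is still a C afterwards."""
    return [
        index
        for index, (before, after) in enumerate(zip(original.upper(), converted))
        if before == "C" and after == "C"
    ]


if __name__ == "__main__":
    sequence = "ACGTCCGATCGA"   # hypothetical example sequence
    methylated = {1, 9}         # hypothetical 5-methylcytosine positions (0-based)
    treated = bisulfite_convert(sequence, methylated)
    print(treated)
    print(called_methylated(sequence, treated))
```

Comparing the treated sequence with the untreated reference recovers the methylated positions, which is the information that ordinary sequencing or PCR of untreated DNA cannot provide.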

An overview of the further known methods of detecting 5-methylcytosine may be gathered 
from the following review article: Rein, T., DePamphilis, M. L., Zorbas, H., Nucleic Acids
Res. 1998, 26, 2255. 

To date, barring few exceptions (e.g., Zeschnigk M, Lich C, Buiting K, Doerfler W, and
Horsthemke B. A single-tube PCR test for the diagnosis of Angelman and Prader-Willi
syndrome based on allelic methylation differences at the SNRPN locus. Eur J Hum Genet. 1997
Mar-Apr;5(2):94-8), the bisulphite technique is only used in research. In every case, however,
short, specific fragments of a known gene are amplified subsequent to a bisulfite treatment
and either completely sequenced (Olek A, Walter J. The pre-implantation ontogeny of the H19
methylation imprint. Nat Genet. 1997 Nov;17(3):275-6), or individual cytosine positions are
detected by a primer extension reaction (Gonzalgo ML, Jones PA. Rapid quantitation of
methylation differences at specific sites using methylation-sensitive single nucleotide primer
extension (Ms-SNuPE). Nucleic Acids Res. 1997 Jun 15;25(12):2529-31; WO 95/00669) or
by enzymatic digestion (Xiong Z, Laird PW. COBRA: a sensitive and quantitative DNA
methylation assay. Nucleic Acids Res. 1997 Jun 15;25(12):2532-4). In addition, detection by
hybridisation has also been described (Olek et al., WO 99/28498).

Further publications dealing with the use of the bisulfite technique for methylation detection 
in individual genes are: Grigg G, Clark S. Sequencing 5-methylcytosine residues in genomic 
DNA. Bioessays. 1994 Jun;16(6):431-6, 431; Zeschnigk M, Schmitz B, Dittrich B, Buiting K, 
Horsthemke B, Doerfler W. Imprinted segments in the human genome: different DNA meth- 
ylation patterns in the Prader-Willi/Angelman syndrome region as determined by the genomic 
sequencing method. Hum Mol Genet. 1997 Mar;6(3):387-95; Feil R, Charlton J, Bird AP, 
Walter J, Reik W. Methylation analysis on individual chromosomes: improved protocol for 
bisulphite genomic sequencing. Nucleic Acids Res. 1994 Feb 25;22(4):695-6; Martin V,
Ribieras S, Song-Wang X, Rio MC, Dante R. Genomic sequencing indicates a correlation
between DNA hypomethylation in the 5' region of the pS2 gene and its expression in human
breast cancer cell lines. Gene. 1995 May 19;157(1-2):261-4; WO 97/46705, WO 95/15373,
and WO 97/45560.

An overview of the prior art in oligomer array manufacturing can be gathered from a special
edition of Nature Genetics (Nature Genetics Supplement, Volume 21, January 1999) and from
the literature cited therein.

Fluorescently labelled probes are often used for the scanning of immobilised DNA arrays.
The simple attachment of Cy3 and Cy5 dyes to the 5'-OH of the specific probe is particularly
suitable for fluorescence labelling. The detection of the fluorescence of the hybridised probes
may be carried out, for example, via a confocal microscope. Cy3 and Cy5 dyes, besides many
others, are commercially available.

Matrix Assisted Laser Desorption Ionisation Mass Spectrometry (MALDI-TOF) is a very
efficient development for the analysis of biomolecules (Karas M, Hillenkamp F. Laser
desorption ionisation of proteins with molecular masses exceeding 10,000 Daltons. Anal Chem.
1988 Oct 15;60(20):2299-301). An analyte is embedded in a light-absorbing matrix. The
matrix is evaporated by a short laser pulse, thus transporting the analyte molecule into the
vapour phase in an unfragmented manner. The analyte is ionised by collisions with matrix
molecules. An applied voltage accelerates the ions into a field-free flight tube. Due to their
different masses, the ions are accelerated at different rates. Smaller ions reach the detector
sooner than bigger ones.
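
The mass dependence of the flight time can be made concrete with a minimal sketch of an idealised linear time-of-flight instrument. The accelerating voltage and tube length below are illustrative assumptions rather than values from any cited instrument, and the model ignores the initial velocity spread and the time spent in the acceleration region.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit, kilograms


def flight_time(mass_da, charge=1, voltage=20_000.0, tube_length=1.0):
    """Idealised drift time (seconds) of an ion through a field-free tube.

    The ion gains kinetic energy z*e*U during acceleration, so its drift
    velocity is v = sqrt(2*z*e*U / m) and the drift time is L / v.
    voltage (volts) and tube_length (metres) are assumed example values.
    """
    if mass_da <= 0 or charge <= 0 or voltage <= 0 or tube_length <= 0:
        raise ValueError("mass, charge, voltage and tube length must be positive")
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return tube_length / velocity


if __name__ == "__main__":
    for mass in (1_000, 5_000, 20_000):
        microseconds = flight_time(mass) * 1e6
        print(f"{mass:>6} Da: {microseconds:.1f} microseconds")
```

Under these assumptions the flight time scales with the square root of the mass-to-charge ratio, so an ion four times heavier at the same charge takes twice as long to reach the detector.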

MALDI-TOF spectrometry is excellently suited to the analysis of peptides and proteins. The 
analysis of nucleic acids is somewhat more difficult (Gut IG, Beck S. DNA and Matrix
Assisted Laser Desorption Ionization Mass Spectrometry. Current Innovations and Future
Trends. 1995, 1; 147-57). The sensitivity to nucleic acids is approximately 100 times worse
than to peptides and decreases disproportionately with increasing fragment size. For nucleic
acids, which have a multiply negatively charged backbone, the ionisation process via the
matrix is considerably less efficient. In MALDI-TOF spectrometry, the selection of the matrix
plays an eminently important role. For the desorption of peptides, several very efficient
matrices have been found which produce a very fine crystallisation. There are now several
responsive matrices for DNA; however, the difference in sensitivity has not been reduced. The
difference in sensitivity can be reduced by chemically modifying the DNA in such a manner
that it becomes more similar to a peptide. Phosphorothioate nucleic acids, in which the usual
phosphates of the backbone are substituted with thiophosphates, can be converted into a
charge-neutral DNA using simple alkylation chemistry (Gut IG, Beck S. A procedure for selective
DNA alkylation and detection by mass spectrometry. Nucleic Acids Res. 1995 Apr 25; 23(8): 
1367-73). The coupling of a charge tag to this modified DNA results in an increase in sensi- 
tivity to the same level as that found for peptides. A further advantage of charge tagging is the 
increased stability of the analysis against impurities which make the detection of unmodified 
substrates considerably more difficult. 

Genomic DNA is obtained from cell, tissue or other test samples using standard
methods. This standard methodology is found in references such as Sambrook, Fritsch and
Maniatis, Molecular Cloning: A Laboratory Manual, CSH Press, 2nd edition, 1989: Isolation
of genomic DNA from mammalian cells, Protocol I, p. 9.16 - 9.19. The manuals of several
DNA extraction kits, such as the QIAamp DNA mini kit, also give good guidance on how to
isolate genomic DNA.

Currently, several predictive markers are under evaluation. As most patients have up to now
received Tamoxifen as endocrine treatment, most of the markers have been shown to be
associated with response or resistance to Tamoxifen. However, it is generally assumed that
there is a large overlap between responders to one or the other endocrine treatment. In fact,
ER and PR expression are used to select patients for any endocrine treatment. Among the
markers which have been associated with Tamoxifen (TAM) response is bcl-2. High bcl-2 levels
showed a promising correlation with TAM therapy response and prolonged survival in patients
with metastatic disease, and added valuable information for an ER negative patient subgroup
(J Clin Oncology, 1997, 15(5): 1916-1922; Endocrine, 2000, 13(1):1-10). There is conflicting
evidence regarding the independent predictive value of c-erbB2 (Her2/neu) overexpression in
patients with advanced breast cancer, which requires further evaluation and verification
(British J of Cancer, 1999, 79 (7/8): 1220-1226; J Natl Cancer Inst, 1998, 90 (21): 1601-1608).

Other predictive markers include SRC-1 (steroid receptor coactivator-1), CGA gene
overexpression, cell kinetics and S phase fraction assays (Breast Cancer Res and Treat, 1998,
48:87-92; Oncogene, 2001, 20:6955-6959). Recently, uPA (urokinase-type plasminogen activator)
and PAI-1 (plasminogen activator inhibitor type 1) were together shown to be useful in
defining a subgroup of patients who have a worse prognosis and who would benefit from adjuvant
systemic therapy (J Clinical Oncology, 2002, 20 n° 4). However, all of these markers need
further evaluation in prospective trials, as none of them is yet a validated marker of response.

A number of cancer-associated genes have been shown to be inactivated by hypermethylation 
of CpG islands during breast tumorigenesis. Decreased expression of the calcium binding 
protein S100A2 (Accession number NM_005978) has been associated with the development 
of breast cancers. Hypermethylation of the promoter region of this gene has been observed in 
neoplastic cells, thus providing evidence that S100A2 repression in tumour cells is mediated
by site-specific methylation. 

The gene SYK (Accession number NM_003177) encodes a protein tyrosine kinase, Syk 
(spleen tyrosine kinase), that is highly expressed in hematopoietic cells. Syk is expressed in 
normal breast ductal epithelial cells but not in a subset of invasive breast carcinomas. Also,
the loss of Syk expression seems to be associated with malignant phenotypes such as increased
motility and invasion. The loss of expression occurs at the transcriptional level and, as
indicated by Yuan Y, Mendez R, Sahin A and Dai JL (Hypermethylation leads to silencing of the
SYK gene in human breast cancer. Cancer Res. 2001 Jul 15;61(14):5558-61), is a result of
DNA hypermethylation.

The TGF-β type 2 receptor (encoded by the TGFBR2 gene, NM_003242) plays a role in
trans-membrane signalling pathways via a complex of serine/threonine kinases. Mutations in 
the gene have been detected in some primary tumours and in several types of tumour-derived 
cell lines, including breast (Lucke CD, Philpott A, Metcalfe JC, Thompson AM, Hughes- 
Davies L, Kemp PR, Hesketh R. 'Inhibiting mutations in the transforming growth factor beta 
type 2 receptor in recurrent human breast cancer.' Cancer Res. 2001 Jan 15;61(2):482-5.). 

The genes COX7A2L and GRIN2D were both identified as novel estrogen responsive
elements by Watanabe et al. (Isolation of estrogen-responsive genes with a CpG island library.
Molec. Cell. Biol. 18: 442-449, 1998) using the CpG-GBS (genomic binding site) method.
The gene COX7A2L (Accession number NM_004718) encodes a polypeptide 2-like
cytochrome C oxidase subunit VIIA. Northern blot analysis detected an upregulation of
COX7A2L after estrogen treatment of a breast cancer cell line. The gene GRIN2D (Accession
number NM_000836) encodes the N-methyl-D-aspartate, ionotropic, subunit 2D glutamate
receptor, a subunit of the NMDA receptor channels associated with neuronal signalling.
Furthermore, expression of the cDNA has been observed in an osteosarcoma cell line. The gene
VTN (also known as Vitronectin, Accession number NM_000638) encodes a 75-kD
glycoprotein (also called serum spreading factor or complement S-protein) that promotes
attachment and spreading of animal cells in vitro, inhibits cytolysis by the complement C5b-9
complex, and modulates antithrombin III-thrombin action in blood coagulation. Furthermore,
expression of this gene has been linked to progression and invasiveness of cancer cells.

The gene SFN (also known as Stratifin) encodes a polypeptide of the 14-3-3 family, 14-3-3
sigma. The 14-3-3 family of proteins mediates signal transduction by binding to
phosphoserine-containing proteins. Expression of the SFN gene is lost in breast carcinomas;
this is likely due to hypermethylation during the early stages of neoplastic transformation
(see Umbricht CB, Evron E, Gabrielson E, Ferguson A, Marks J, Sukumar S. Hypermethylation of
14-3-3 sigma (stratifin) is an early event in breast cancer. Oncogene. 2001 Jun 7;20(26):3348-53).

The gene PSAT1 (Accession number NM_021154) is not to be confused with the gene
popularly referred to as PSA (Accession number NM_001648), which encodes prostate specific
antigen and whose technically correct name is kallikrein 3. The gene PSAT1 encodes the
protein phosphoserine aminotransferase, the enzyme catalysing the second step in the
serine biosynthesis pathway. Changes in gene expression levels have been monitored by
mRNA expression analysis, and upregulation of the gene has been identified in colonic
carcinoma in a study of 6 samples (Electrophoresis 2002 Jun;23(11):1667-76. mRNA differential
display of gene expression in colonic carcinoma. Ojala P, Sundstrom J, Gronroos JM, Virtanen
E, Talvinen K, Nevalainen TJ).

The gene stathmin (NM_005563) codes for oncoprotein 18, also known as stathmin, a
conserved cytosolic phosphoprotein that regulates microtubule dynamics. The protein is highly
expressed in a variety of human malignancies. In human breast cancers, the stathmin gene has
been shown to be up-regulated in a subset of the tumours.

The gene PRKCD encodes a member of the family of protein kinase C enzymes, and is
involved in B cell signaling and in the regulation of growth, apoptosis, and differentiation of a
variety of cell types. 

Some of these molecules interact in a cascade-like manner. PRKCD activity that targets 
STMN1 is modulated by SFN binding and SYK phosphorylation. Together this influences 
tubulin polymerization that is required for cell division. 

The gene MSMB (Accession number NM_002443) has been mapped to 10q11.2. It encodes
the beta-microseminoprotein (MSP), which is one of the major proteins secreted by the
prostate. Furthermore, it may be useful as a diagnostic marker for prostate cancer. Using mRNA
analysis, low levels of beta-MSP mRNA expression and protein have been linked to
progression under endocrine therapy, and it has been postulated that it may be indicative of
potentially aggressive prostate cancer (see Sakai H, Tsurusaki T, Kanda S, Koji T, Xuan JW,
Saito Y. 'Prognostic significance of beta-microseminoprotein mRNA expression in prostate
cancer.' Prostate. 1999 Mar 1;38(4):278-84).

The gene TP53 (Accession number NM_000546) encodes the protein p53, one of the most
well characterised tumour suppressor proteins. The p53 protein acts as a transcription factor
and serves as a key regulator of the cell cycle. Inactivation of this gene through mutation
disrupts the cell cycle, which, in turn, assists in tumour formation. Methylation changes
associated with this gene have been reported to be significant in breast cancer. Saraswati et al.
(Nature 405, 974 - 978 (22 Jun 2000) 'Compromised HOXA5 function can limit p53
expression in human breast tumours') reported that low levels of p53 mRNA in breast tumours
were correlated with methylation of the HOXA5 gene. The product of the HOXA5 gene binds to
the promoter region of the p53 gene and mediates its expression. Methylation of the promoter
region of the p53 gene itself has been reported (Kang JH, Kim SJ, Noh DY, Park IA, Choe
KJ, Yoo OJ, Kang HS. 'Methylation in the p53 promoter is a supplementary route to breast
carcinogenesis: correlation between CpG methylation in the p53 promoter and the mutation of
the p53 gene in the progression from ductal carcinoma in situ to invasive ductal carcinoma.'
Lab Invest. 2001 Apr;81(4):573-9). It was therein demonstrated that CpG methylation in the
p53 promoter region is found in breast cancer, and it was hypothesised that methylation in the
p53 promoter region could be an alternative pathway to neoplastic progression in breast
tumours. It has been observed that treatment with Tamoxifen decreases the level of expression
of the p53 gene (Farczadi E, Kaszas I, Baki M, Szende B. 'Changes in apoptosis, mitosis,
Her-2, p53 and Bcl2 expression in breast carcinomas after short-term tamoxifen treatment.'
Neoplasma. 2002;49(2):101-3).

The gene CYP2D6 (Accession number: NM_000106) is a member of the human cytochrome 
P450 (CYP) superfamily. Many members of this family are involved in drug metabolism (see 
for example Curr Drug Metab. 2002 Jun;3(3):289-309. Rodrigues AD, Rushmore TH). Of
these, cytochrome P450 CYP2D6 is one of the most extensively characterised. It is highly
polymorphic (more than 70 variations of the gene have been described), and allelic variation
can result in both increased and decreased enzymatic activity. The CYP2D6 enzyme catalyses
the metabolism of a large number of clinically important drugs, including antidepressants,
neuroleptics and some antiarrhythmics (Nature 1990 Oct 25;347(6295):773-6. Identification of
the primary gene defect at the cytochrome P450 CYP2D locus. Gough AC, Miles JS, Spurr NK,
Moss JE, Gaedigk A, Eichelbaum M, Wolf CR).

The gene PTGS2 (Accession number NM_000963) encodes an inducible isozyme of
prostaglandin-endoperoxide synthase (prostaglandin-endoperoxide synthase 2). It is also known
as COX2 (cyclooxygenase 2). Aberrant methylation of this gene has been identified in lung
carcinomas (Cancer Epidemiol Biomarkers Prev 2002 Mar;11(3):291-7. Hierarchical clustering
of lung cancer cell lines using DNA methylation markers. Virmani AK, Tsou JA, Siegmund
KD, Shen LY, Long TI, Laird PW, Gazdar AF, Laird-Offringa IA).

The gene CGA (Accession number NM_000735) encodes the alpha polypeptide of
glycoprotein hormones. Further, it has been identified as an estrogen receptor alpha (ER
alpha)-responsive gene, and overexpression of the gene has been linked to ER positivity in
breast tumours. Bieche et al. examined mRNA levels of said gene in 125 ER alpha-positive
postmenopausal breast cancer patients treated with primary surgery followed by adjuvant
tamoxifen therapy. Initial results indicated significant links between CGA gene overexpression
and Scarff-Bloom-Richardson histopathological grade I+II and progesterone and estrogen
receptor positivity, which suggested that CGA is a marker of low tumour aggressiveness
('Identification of CGA as a Novel Estrogen Receptor-responsive Gene in Breast Cancer: An
Outstanding Candidate Marker to Predict the Response to Endocrine Therapy.' Cancer Research
61, 1652-1658, February 15, 2001. Ivan Bieche, Beatrice Parfait, Vivianne Le Doussal, Martine
Olivi, Marie-Christine Rio, Rosette Lidereau and Michel Vidaud). Further mRNA expression
analysis linked CGA expression levels to Tamoxifen response; it was postulated that, when
combined with analysis of the marker ERBB2 (a marker of poor response), the gene may be
useful as a predictive marker of tamoxifen responsiveness in breast cancer (Oncogene 2001
Oct 18;20(47):6955-9. The CGA gene as new predictor of the response to endocrine therapy in
ER alpha-positive postmenopausal breast cancer patients. Bieche I, Parfait B, Nogues C,
Andrieu C, Vidaud D, Spyratos F, Lidereau R, Vidaud M). The authors provided significant data
associating the expression of the gene CGA with Tamoxifen treatment response. However,
said analyses have all focused upon the analysis of relative levels of mRNA expression. This
is not a methodology that is suitable for medium or high throughput, nor is it a suitable basis
for the development of a clinical assay.

The gene PITX2 (NM_000325) encodes the paired-like homeodomain transcription factor 2,
which is known to be expressed during development of anterior structures such as the eye,
teeth, and anterior pituitary. Although the expression of this gene is associated with cell
differentiation and proliferation, it has no heretofore recognised role in carcinogenesis or
responsiveness to endocrine treatment. Toyota et al. (2001. Blood. 97:2823-9) found
hypermethylation of the PITX2 gene in a large proportion of acute myeloid leukemias.
Furthermore, this hypermethylation is positively correlated to methylation of the ER gene.

The RASSF1A (Ras association domain family 1A) gene is a candidate tumour suppressor gene at
3p21.3. The Ras GTPases are a superfamily of molecular switches that regulate cellular
proliferation and apoptosis in response to extra-cellular signals. It is purported that
RASSF1A is a tumour suppressor gene, and epigenetic alterations of this gene have been observed
in a variety of cancers. Methylation of RASSF1A has been associated with poor prognosis in
primary non-small cell lung cancer (Kim DH, Kim JS, Ji YI, Shim YM, Kim H, Han J, Park J.
'Hypermethylation of RASSF1A promoter is associated with the age at starting smoking and a
poor prognosis in primary non-small cell lung cancer.' Cancer Res. 2003 Jul 1;63(13):3743-6).
It has also been associated with the development of pancreatic cancer (Kuzmin I, Liu L,
Dammann R, Geil L, Stanbridge EJ, Wilczynski SP, Lerman MI, Pfeifer GP. 'Inactivation of
RAS association domain family 1A gene in cervical carcinomas and the role of human
papillomavirus infection.' Cancer Res. 2003 Apr 15;63(8):1888-93), as well as testicular
tumours and prostate carcinoma amongst others. The application of the methylation of this gene
as a cancer diagnostic marker has been described in U.S. patent 6,596,488; that patent does
not, however, describe its application in the selection of appropriate treatment regimens for
patients.

Also located within 3p21 is the Dystroglycan precursor gene (Dystrophin-associated
glycoprotein 1) (NM_004393). Dystroglycan (DG, also known as DAG1) is an adhesion molecule
comprising two subunits, namely alpha-DG and beta-DG. The molecule is responsible for
crucial interactions between the extracellular matrix and the cytoplasmic compartment, and it
has been hypothesised that as such it may contribute to progression to metastatic disease.
Decreased expression of this gene has been correlated with higher tumour grade and
stage in colon, prostate and breast tumours.

The onecut-2 transcription factor gene (NM_004852), located at 18q21.31, encodes a
homeodomain transcription factor that regulates liver gene expression in adults and during
development.

The trefoil factor 1 (TFF1) gene (NM_003225) encodes a member of the trefoil family of
proteins. The gene is also known as pS2. Trefoil proteins are normally expressed at highest
levels in the mucosa of the gastrointestinal tract; however, they are often expressed
ectopically in primary tumours of other tissues, including breast. The expression of TFF1 is
regulated by estrogen in estrogen-responsive breast cancer cells in culture, its expression is
associated with that of the estrogen receptor, and TFF1 is a marker of hormone responsiveness
in tumours (Schwartz et al., 1991. pS2 expression and response to hormonal therapy in patients
with advanced breast cancer. Cancer Res. 51:624-8). TFF1 promoter methylation has been
observed in non-expressing gastric carcinoma-derived cell lines and tissues.

TMEFF2 (NM_016192) encodes a transmembrane protein containing an epidermal growth
factor (EGF)-like motif and two follistatin domains. It has been shown to be overexpressed in
prostate and brain tissues and it has been suggested that this is an androgen-regulated gene
exhibiting antiproliferative effects in prostate cancer cells.

Methylation of the gene ESR1 (NM_000125), encoding the estrogen receptor, has been linked
to several cancer types including lung, oesophageal, brain and colorectal. The estrogen
receptor (ESR) is a ligand-activated transcription factor composed of several domains important
for hormone binding, DNA binding, and activation of transcription. Furthermore, it is the
direct target of the anti-estrogenic compound Tamoxifen. Only tumours expressing estrogen
receptor (ER+) can respond to Tamoxifen treatment.

The PCAF (NM_003884) gene encodes the p300/CBP-Associated Factor (PCAF). CBP and
p300 are large nuclear proteins that bind to many sequence-specific factors involved in cell
growth and/or differentiation. The p300/CBP associated factor displays in vivo binding
activity with CBP and p300. The protein has histone acetyl transferase activity with core histones
and nucleosome core particles, indicating that it plays a direct role in transcriptional
regulation. p300/CBP associated factor also associates with NF-kappa-B p65. This protein has been
shown to regulate expression of the gene p53 by acetylation of Lys320 in the C-terminal
portion of p53.

The WBP11 (NM_016312) gene encodes a nuclear protein, which co-localises with mRNA 
splicing factors and intermediate filament-containing perinuclear networks. It contains two 
proline-rich regions that bind to the WW domain of Npw38, a nuclear protein, and thus this 
protein is also called Npw38-binding protein NpwBP. 

The TBC1 domain family, member 3 gene (TBC1D3, NM_032258) was discovered originally
as an oncogene, also known as PRC17. The gene product contains a GTPase-activating
protein (GAP) catalytic core motif and interacts directly with Rab5, stimulating its GTP
hydrolysis. TBC1D3 is amplified in 15% of prostate cancers and highly overexpressed in
approximately one-half of metastatic prostate tumors (Pei et al., 2002; Cancer Res. 62:5420-4).

The CDK6 gene encodes a cyclin-dependent protein kinase regulating major cell cycle
transitions in eukaryotic cells. The cdk6 kinase is associated with cyclins D1, D2, and D3 and can
phosphorylate pRB, the product of the retinoblastoma tumor suppressor gene. The activation
of cdk6 kinase occurs during mid-G1 (Meyerson and Harlow, 1994; Mol Cell Biol.
14:2077-86).

Description 

In the following, certain genetic regions are described for which no genetic nomenclature is
presently available. In each case the chromosomal location of the genetic sequence is denoted
within parentheses ( ) and the genetic sequence is further described by its sequence
according to Table 1.

The present invention provides methods and nucleic acids for the improved treatment plan- 
ning of patients with cell proliferative disorders of the breast tissues. The aim of the invention 
is achieved by assessment of one or both of two factors of particular relevance to patient 
treatment planning. The first factor is the characterisation of the cell proliferative disorder of
the breast tissues and/or metastases thereof in terms of aggressiveness; the second factor is
the prediction of disease free survival and/or response of a subject with said disorder to a
therapy comprising one or more treatments which target the estrogen receptor pathway or are 
involved in estrogen metabolism, production or secretion. Said treatments include, but are not 
limited to estrogen receptor modulators, estrogen receptor down-regulators, aromatase in- 
hibitors, ovarian ablation, LHRH analogues and other centrally acting drugs influencing es- 
trogen production. 

The prediction of response to a therapeutic regimen comprising one or more treatments which 
target the estrogen receptor pathway or are involved in estrogen metabolism, production or 
secretion (a current treatment of choice as side effects are limited) further enables the
physician to determine if additional treatments will be required in addition to or instead of this
treatment. Treatments which may be used in addition to or instead of said treatment include, 
but are not limited to chemotherapy, radiotherapy, surgery, biological therapy, immunother- 
apy, antibodies and molecularly targeted drugs. 

Characterisation of a breast cancer in terms of its predicted aggressiveness enables the
physician to make an informed decision as to a therapeutic regimen with appropriate risk and
benefit trade-offs to the patient. Aggressiveness is taken to mean one or more of decreased patient
survival or disease- or relapse-free survival, increased tumor-related complications and faster
progression of tumor or metastases. According to the aggressiveness of the disease an
appropriate treatment or treatments may be selected from the group consisting of chemotherapy,
radiotherapy, surgery, biological therapy, immunotherapy, antibody treatments, treatments
involving molecularly targeted drugs, estrogen receptor modulator treatments, estrogen
receptor down-regulator treatments, aromatase inhibitor treatments, ovarian ablation, and
treatments providing LHRH analogues or other centrally acting drugs influencing estrogen
production. Wherein a cancer is characterised as 'aggressive' it is particularly preferred that a
treatment such as, but not limited to, chemotherapy is provided in addition to or instead of an
endocrine targeting therapy.

Using the methods and nucleic acids described herein, statistically significant models of pa- 
tient disease free survival and/or responsiveness to treatment and/or disease progression can 
be developed and utilised to assist patients and clinicians in determining suitable treatment 
options to be included in the therapeutic regimen. In one aspect the described method is to be 
used to assess the utility of therapeutic regimens comprising one or more treatments which 
target the estrogen receptor pathway or are involved in estrogen metabolism, production or 
secretion as a therapy for patients suffering from a cell proliferative disorder of the breast
tissues. In particular this aspect of the method enables the physician to determine which
treatments may be used in addition to or instead of said treatment. In a further aspect the described
method enables the characterisation of the cell proliferative disorder in terms of aggressiveness,
thereby enabling the physician to recommend suitable treatments. Thus, the present invention
will be seen to reduce the problems associated with present breast cell proliferative disorder
treatment response prediction methods.

Using the methods and nucleic acids as described herein, patient responsiveness can be
evaluated before or during treatment for a cell proliferative disorder of the breast tissues, in order to
provide critical information to the patient and clinician as to the likely progression of the dis- 
ease. It will be appreciated, therefore, that the methods and nucleic acids exemplified herein 
can serve to improve a patient's quality of life and odds of treatment success by allowing both 
patient and clinician a more accurate assessment of the patient's treatment options. 

The method according to the invention may be used for the improved treatment of all breast
cell proliferative disorder patients, both pre- and post-menopausal and independent of their
node or estrogen receptor status. However, it is particularly preferred that said patients are
node-negative and estrogen receptor positive.

The aim of the invention is most preferably achieved by means of the analysis of the
methylation patterns of one or a combination of genes taken from the group
EGR4, APC, CDKN2A, CSPG2, ERBB2, STMN1, STK11, CA9, PAX6, SFN, S100A2,
TFF1, TGFBR2, TP53, TP73, PLAU, TMEFF2, ESR1, SYK, HSPB1, RASSF1, TES,
GRIN2D, PSAT1, CGA, CYP2D6, COX7A2L, ESR2, PLAU, VTN, SULT1A1, PCAF,
PRKCD, ONECUT2, BCL6, WBP11, (MX1)MX1, APP, ORC4L, NETO1, TBC1D3, GRB7,
CDK6, SEQ ID NO: 47, SEQ ID NO: 48, ABCA8, SEQ ID NO: 50, SEQ ID NO: 51,
MARK2, ELK1, Q8WUT3, CGB, BSG, BCKDK, SOX8, DAG1, SEMA4B, and ESR1
(exon8) (see Table 1) and/or their regulatory regions.

The invention is characterised in that the nucleic acid of one or a combination of genes taken
from the group EGR4, APC, CDKN2A, CSPG2, ERBB2, STMN1, STK11, CA9, PAX6,
SFN, S100A2, TFF1, TGFBR2, TP53, TP73, PLAU, TMEFF2, ESR1, SYK, HSPB1,
RASSF1, TES, GRIN2D, PSAT1, CGA, CYP2D6, COX7A2L, ESR2, PLAU, VTN,
SULT1A1, PCAF, PRKCD, ONECUT2, BCL6, WBP11, (MX1)MX1, APP, ORC4L,
NETO1, TBC1D3, GRB7, CDK6, SEQ ID NO: 47, SEQ ID NO: 48, ABCA8, SEQ ID NO:
50, SEQ ID NO: 51, MARK2, ELK1, Q8WUT3, CGB, BSG, BCKDK, SOX8, DAG1,
SEMA4B, and ESR1 (exon8) is contacted with a reagent or series of reagents capable of
distinguishing between methylated and non-methylated CpG dinucleotides within the genomic
sequence of interest.

The present invention makes available a method for the improved treatment and monitoring
of breast cell proliferative disorders, by enabling the accurate prediction of a patient's disease 
free survival and/or response to treatment with a therapy comprising one or more treatments 
which target the estrogen receptor pathway or are involved in estrogen metabolism, produc- 
tion, or secretion. 

In a particularly preferred embodiment, the method according to the invention enables the
differentiation between patients who have a high probability of response to said therapy and
those who have a low probability of response to said therapy or a methylation characteristic
predicted disease free survival time, in addition to the characterisation of tumors in terms of
aggressiveness.

The method according to the invention may be used for the analysis of a wide variety of cell 
proliferative disorders of the breast tissues including, but not limited to, ductal carcinoma in 
situ, invasive ductal carcinoma, invasive lobular carcinoma, lobular carcinoma in situ, come- 
docarcinoma, inflammatory carcinoma, mucinous carcinoma, scirrhous carcinoma, colloid 
carcinoma, tubular carcinoma, medullary carcinoma, metaplastic carcinoma, papillary
carcinoma, papillary carcinoma in situ, undifferentiated or anaplastic carcinoma and
Paget's disease of the breast.

The method according to the invention is particularly suited to the prediction of response to
the aforementioned therapy in two treatment settings. In one embodiment, the method is
applied to patients who receive endocrine pathway targeting treatment as secondary treatment to
an initial non-chemotherapeutic therapy, e.g. surgery (hereinafter referred to as the adjuvant
setting) as illustrated in Figure 1. Such a treatment is often prescribed to patients suffering
from Stage 1 to 3 breast carcinomas. In this embodiment patients' disease free survival times
are predicted. By detecting patients with worse disease free survival times
the physician may choose to recommend the patient for further treatment, instead of or in
addition to the endocrine targeting therapy(s), in particular, but not limited to, chemotherapy.

In a further preferred embodiment said method is applied to patients suffering from a relapse
of breast cancer following treatment by a primary means (preferably surgery) followed by a
disease free period, and wherein the endocrine pathway targeting treatment has been
prescribed in response to a detection of a relapse of the carcinoma. Such a treatment is often
prescribed to patients suffering from later stage carcinomas, particularly wherein metastasis has
occurred. Therefore this clinical setting shall also hereinafter be referred to as the 'metastatic
setting'. In this embodiment responders are those who enter partial or complete remission, i.e.
subjects whose cancer recedes to undetectable levels, as opposed to those whose diseases
further metastasise or remain above detectable levels. By detecting patients whose cancers are
likely to metastasise the physician may choose to recommend the patient for further treatment,
instead of or in addition to the endocrine targeting therapy(s), in particular, but not limited to,
chemotherapy.

This methodology presents further improvements over the state of the art in that the method 
may be applied to any subject, independent of the estrogen and/or progesterone receptor 
status. Therefore in a preferred embodiment, the subject is not required to have been tested for 
estrogen or progesterone receptor status. 

The object of the invention is achieved by means of the analysis of the methylation patterns of 
one or more of the genes EGR4, APC, CDKN2A, CSPG2, ERBB2, STMN1, STK11, CA9, 
PAX6, SFN, S100A2, TFF1, TGFBR2, TP53, TP73, PLAU, TMEFF2, ESR1, SYK, HSPB1,
RASSF1, TES, GRIN2D, PSAT1, CGA, CYP2D6, COX7A2L, ESR2, PLAU, VTN,
SULT1A1, PCAF, PRKCD, ONECUT2, BCL6, WBP11, (MX1)MX1, APP, ORC4L,
NETO1, TBC1D3, GRB7, CDK6, SEQ ID NO: 47, SEQ ID NO: 48, ABCA8, SEQ ID NO:
50, SEQ ID NO: 51, MARK2, ELK1, Q8WUT3, CGB, BSG, BCKDK, SOX8, DAG1, 
SEMA4B, ESR1 (exon8) and/or their regulatory regions. In a particularly preferred embodi- 
ment the sequences of said genes comprise SEQ ID NOs: 1-61 and sequences complementary 
thereto. 

The object of the invention may also be achieved by analysing the methylation patterns of one 
or more genes taken from the following subsets of said aforementioned group of genes. In one 
embodiment the object of the invention is the prediction of disease free survival and/or
probability of response to a treatment which targets the estrogen receptor pathway or is involved
in estrogen metabolism, production or secretion. This is achieved by analysis of the
methylation patterns of one or more genes taken from the group consisting of ERBB2, STMN1, TFF1,
TMEFF2, ESR1, HSPB1, PITX2, COX7A2L, PLAU, VTN, PCAF, ONECUT2, BCL6,
WBP11, TBC1D3, GRB7, CDK6, SEQ ID NO: 47, ABCA8 and SEQ ID NO: 51 and wherein
it is further preferred that the sequence of said genes comprise SEQ ID NOs: 5, 6, 12, 17, 18, 
20, 23, 28, 16, 31, 33, 35, 36, 37, 43, 44, 46, 47, 49 and 51, respectively, according to Table 1. 
It is preferred that said gene is PITX2. 

It is preferred that the object of the invention is achieved by analysing the methylation
patterns of a plurality of genes, hereinafter also referred to as a gene panel. It is further preferred
that said plurality is between two and four genes. In one embodiment said gene panel
consists of PITX2, TBC1D3 and CDK6. It is particularly preferred that said gene panel
is selected from the group consisting of TFF1 and PLAU; TFF1 and PLAU and PITX2;
PITX2 and TFF1; PITX2 and PLAU. Further preferred is the gene panel of TFF1 and PITX2
for the prediction of disease free survival or metastasis in treated patients.

In a further embodiment the object of the invention is the characterisation of the tumor in
terms of aggressiveness. This is achieved by analysis of the methylation patterns of one or
more genes taken from the group consisting of APC, CSPG2, ERBB2, STK11, S100A2, TFF1,
TGFBR2, TP53, TMEFF2, SYK, HSPB1, RASSF1, PSAT1, CGA, ESR2, ONECUT2,
WBP11, CYP2D6, CDK6, ELK1, CGB and DAG1, and wherein it is further preferred that
the sequence of said genes comprise SEQ ID NOs: 2, 4, 5, 7, 11, 12, 13, 14, 17, 19, 20, 21, 
25, 26, 29, 35, 37, 45, 46, 53, 55 and 59, respectively, according to Table 1. 

In a preferred embodiment said method is carried out by contacting said nucleic acid sequences
in a biological sample obtained from a subject with at least one reagent or a series of reagents,
wherein said reagent or series of reagents distinguishes between methylated and
non-methylated CpG dinucleotides within the target nucleic acid.

In a preferred embodiment, the method comprises the following steps: In the first step, a sample
of the tissue to be analysed is obtained. The source may be any suitable source, such as cell lines,
histological slides, biopsies, tissue embedded in paraffin, bodily fluids, urine, blood and all
possible combinations thereof. In a particularly preferred embodiment of the method said
source is bodily fluids, urine, or blood. The DNA is then isolated from the sample. Extraction
may be by means that are standard to one skilled in the art, including the use of commercially
available kits, detergent lysates, sonication and vortexing with glass beads. Briefly, wherein
the DNA of interest is encapsulated by a cellular membrane the biological sample must be
disrupted and lysed by enzymatic, chemical or mechanical means. The DNA solution may
then be cleared of proteins and other contaminants, e.g. by digestion with proteinase K. The
genomic DNA is then recovered from the solution. This may be carried out by means of a
variety of methods including salting out, organic extraction or binding of the DNA to a solid
phase support. The choice of method will be affected by several factors including time,
expense and required quantity of DNA. Once the nucleic acids have been extracted, the genomic
double stranded DNA is used in the analysis.

In the second step of the method, the genomic DNA sample is treated in such a manner that 
cytosine bases which are unmethylated at the 5'-position are converted to uracil, thymine, or
another base which is dissimilar to cytosine in terms of hybridization behavior. This will be 
understood as 'pretreatment' herein. 

The above-described treatment of genomic DNA is preferably carried out with bisulfite (hy- 
drogen sulfite, disulfite) and subsequent alkaline hydrolysis that results in a conversion of 
non-methylated cytosine nucleobases to uracil or to another base that is dissimilar to cytosine 
in terms of base pairing behavior. 
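
The effect of the pretreatment on a sequence can be illustrated in silico. The following is a minimal sketch, not part of the described method: the function name `bisulfite_convert`, the toy sequence and the chosen methylated position are assumptions made for illustration, and unmethylated cytosines are written as T, the base a uracil is read as after amplification.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite pretreatment of one DNA strand, read 5'->3'.

    Cytosines whose 0-based index is in `methylated_positions` stay C.
    Every other cytosine is written as T, which is how the resulting
    uracil is read after amplification.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical toy sequence: the cytosine at index 1 sits in a CpG.
toy_sequence = "ACGTCCGA"
unmethylated_result = bisulfite_convert(toy_sequence, set())
methylated_result = bisulfite_convert(toy_sequence, {1})
```

After conversion the two results differ only at the methylated cytosine, which is the difference the subsequent amplification and detection steps exploit.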

In the third step of the method, fragments of the pretreated DNA are amplified, using sets of 
primer oligonucleotides according to the present invention, and an amplification enzyme. The 
amplification of several DNA segments can be carried out simultaneously in one and the same 
reaction vessel. Typically, the amplification is carried out using a polymerase chain reaction 
(PCR). The set of primer oligonucleotides includes at least two oligonucleotides whose
sequences are each reverse complementary, identical, or hybridize under stringent or highly
stringent conditions to an at least 16-base-pair long segment of the base sequences of one or 
more of SEQ ID NO 206 to 449 and sequences complementary thereto. 
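
The sequence criterion for a primer can be checked computationally. The sketch below covers only the "identical" and "reverse complementary" cases for a primer of at least 16 bases; hybridization under stringent conditions is not modelled. The function names and the toy target sequence are assumptions for illustration and are not taken from the Sequence Listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def matches_target(primer, target, min_length=16):
    """True if the primer has at least `min_length` bases and is identical
    or reverse complementary to a segment of `target`."""
    primer = primer.upper()
    target = target.upper()
    if len(primer) < min_length:
        return False
    return primer in target or reverse_complement(primer) in target


# Hypothetical pretreated target and a primer pair derived from its ends.
toy_target = "GATTTAGGTTGTAGTTTGGAGGTTATTGGA"
forward_primer = toy_target[:18]
reverse_primer = reverse_complement(toy_target[-18:])
```

Both `matches_target(forward_primer, toy_target)` and `matches_target(reverse_primer, toy_target)` hold for this toy pair, whereas a 10-base oligonucleotide is rejected by the length criterion.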

In an alternate embodiment of the method, the methylation status of preselected CpG
positions within the nucleic acid sequences comprising one or more of SEQ ID NO 1 to 61 may
be detected by use of methylation-specific primer oligonucleotides. This technique (MSP) has
been described in United States Patent No. 6,265,171 to Herman. The use of methylation
status specific primers for the amplification of bisulfite treated DNA allows the differentiation
between methylated and unmethylated nucleic acids. MSP primer pairs contain at least one
primer that hybridizes to a bisulfite treated CpG dinucleotide. Therefore, the sequence of said
primers comprises at least one CpG dinucleotide. MSP primers specific for non-methylated
DNA contain a "T" at the 3' position of the C position in the CpG. Preferably, therefore, the
base sequence of said primers is required to comprise a sequence having a length of at least 9
nucleotides which hybridizes to a pretreated nucleic acid sequence according to one of SEQ
ID NO 206-449 and sequences complementary thereto, wherein the base sequence of said
oligomers comprises at least one CpG dinucleotide.
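
The difference between primers specific for methylated and for non-methylated DNA can be sketched as follows. This is a simplified illustration that only derives forward primer sequences matching the converted sense strand; it ignores melting temperature, primer length and all other design constraints. The function name and the toy genomic segment are assumptions.

```python
def msp_forward_primers(genomic_segment):
    """Derive two forward primer sequences from an untreated genomic segment.

    The methylation-specific primer keeps C in every CpG and converts all
    other cytosines to T. The primer specific for non-methylated DNA
    converts every cytosine, including those in CpGs, to T.
    """
    segment = genomic_segment.upper()
    methylated, unmethylated = [], []
    for index, base in enumerate(segment):
        if base != "C":
            methylated.append(base)
            unmethylated.append(base)
            continue
        in_cpg = index + 1 < len(segment) and segment[index + 1] == "G"
        methylated.append("C" if in_cpg else "T")
        unmethylated.append("T")
    return "".join(methylated), "".join(unmethylated)


# Hypothetical segment containing two CpG positions.
m_primer, u_primer = msp_forward_primers("GACCGTTCGA")
```

The two primers differ only at the CpG positions, so under suitable conditions each one amplifies only the template with the matching methylation state.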

Wherein the method is for the prediction of probability of disease free survival and/or
response to a treatment which targets the estrogen receptor pathway or is involved in estrogen
metabolism, production or secretion, it is particularly preferred that said nucleotide
sequence(s) hybridizes to a pretreated nucleic acid sequence according to one of SEQ ID NO
70, 71, 192, 193, 72, 73, 194, 195, 84, 85, 206, 207, 94, 95, 216, 217, 96, 97, 218, 219, 100, 
101, 222, 223, 106, 107, 228, 229, 116, 117, 238, 239, 92, 93, 214, 215, 122, 123, 244, 245, 
126, 127, 248, 249, 130, 131, 252, 253, 132, 133, 254, 255, 134, 135, 256, 257, 146, 147, 268, 
269, 148, 149, 270, 271, 152, 153, 274, 275, 154, 155, 276, 277, 158, 159, 280, 281, 162, 163, 
284 and 285, said contiguous nucleotides comprising at least one CpG, TpG or CpA
dinucleotide sequence.

Wherein the method is for the characterisation of the breast cell proliferative disorder in
terms of aggressiveness it is particularly preferred that said nucleotide sequence(s) hybridizes
to a pretreated nucleic acid sequence according to one of SEQ ID NO 64, 65, 186, 187, 68, 
69, 190, 191, 70, 71, 192, 193, 74, 75, 196, 197, 82, 83, 204, 205, 84, 85, 206, 207, 86, 87, 
208, 209, 88, 89, 210, 211, 94, 95, 216, 217, 98, 99, 220, 221, 100, 101, 222, 223, 102, 103, 
224, 225, 110, 111, 232, 233, 112, 113, 234, 235, 118, 119, 240, 241, 130, 131, 252, 253, 134,
135, 256, 257, 150, 151, 272, 273, 152, 153, 274, 275, 166, 167, 288, 289, 170, 171, 292, 293, 
178, 179, 300, 301, 148, 149, 270, 271, 150, 151, 272, 273, 152, 153, 274, 275, 154, 155, 276, 
277, 156, 157, 278, 279, 158, 159, 280, 281, 160, 161, 282, 283, 162, 163, 284, 285, 164, 165, 
286, 287, 166, 167, 288, 289, 168, 169, 290, 291, 170, 171, 292, 293, 172, 173, 294, 295, 174, 
175, 296, 297, 176, 177, 298, 299, 178, 179, 300, 301, 180, 181, 302, 303, 182, 183, 304 and 
305, said contiguous nucleotides comprising at least one CpG, TpG or CpA dinucleotide se- 
quence. 

A further preferred embodiment of the method comprises the use of blocker oligonucleotides.
The use of such blocker oligonucleotides has been described by Yu et al., BioTechniques
23:714-720, 1997. Blocking probe oligonucleotides are hybridized to the bisulfite treated
nucleic acid concurrently with the PCR primers. PCR amplification of the nucleic acid is
terminated at the 5' position of the blocking probe, such that amplification of a nucleic acid is
suppressed where the complementary sequence to the blocking probe is present. The probes may
be designed to hybridize to the bisulfite treated nucleic acid in a methylation status specific
manner. For example, for detection of methylated nucleic acids within a population of
unmethylated nucleic acids, suppression of the amplification of nucleic acids which are
unmethylated at the position in question would be carried out by the use of blocking probes
comprising a 'CpA' or 'TpA' at the position in question, as opposed to a 'CpG' if the suppression
of amplification of methylated nucleic acids is desired.

For PCR methods using blocker oligonucleotides, efficient disruption of polymerase-mediated
amplification requires that blocker oligonucleotides not be elongated by the polymerase.
Preferably, this is achieved through the use of blockers that are 3'-deoxy oligonucleotides, or
oligonucleotides derivatized at the 3' position with other than a "free" hydroxyl group. For
example, 3'-O-acetyl oligonucleotides are representative of a preferred class of blocker
molecule.

Additionally, polymerase-mediated decomposition of the blocker oligonucleotides should be
precluded. Preferably, such preclusion comprises either use of a polymerase lacking 5'-3'
exonuclease activity, or use of modified blocker oligonucleotides having, for example, thioate
bridges at the 5'-termini thereof that render the blocker molecule nuclease-resistant.
Particular applications may not require such 5' modifications of the blocker. For example, if the
blocker- and primer-binding sites overlap, thereby precluding binding of the primer (e.g., with
excess blocker), degradation of the blocker oligonucleotide will be substantially precluded.
This is because the polymerase will not extend the primer toward, and through (in the 5'-3'
direction) the blocker - a process that normally results in degradation of the hybridized
blocker oligonucleotide.

A particularly preferred blocker/PCR embodiment, for purposes of the present invention and 
as implemented herein, comprises the use of peptide nucleic acid (PNA) oligomers as block- 
ing oligonucleotides. Such PNA blocker oligomers are ideally suited, because they are neither 
decomposed nor extended by the polymerase. Preferably, therefore, the base sequence of said 
blocking oligonucleotides is required to comprise a sequence having a length of at least 9 nu- 
cleotides which hybridizes to a pretreated nucleic acid sequence according to one of SEQ ID 
NO 206-449 and sequences complementary thereto, wherein the base sequence of said
oligonucleotides comprises at least one CpG, TpG or CpA dinucleotide.

Wherein the method is for the prediction of probability of disease free survival and/or
response to a treatment which targets the estrogen receptor pathway or is involved in estrogen
metabolism, production or secretion, it is particularly preferred that said nucleotide
sequence(s) hybridizes to a pretreated nucleic acid sequence according to one of SEQ ID NO
70, 71, 192, 193, 72, 73, 194, 195, 84, 85, 206, 207, 94, 95, 216, 217, 96, 97, 218, 219, 100, 
101, 222, 223, 106, 107, 228, 229, 116, 117, 238, 239, 92, 93, 214, 215, 122, 123, 244, 245, 
126, 127, 248, 249, 130, 131, 252, 253, 132, 133, 254, 255, 134, 135, 256, 257, 146, 147, 268, 
269, 148, 149, 270, 271, 152, 153, 274, 275, 154, 155, 276, 277, 158, 159, 280, 281, 162, 163, 
284 and 285, said contiguous nucleotides comprising at least one CpG, TpG or CpA dinu- 
cleotide sequence. 

Wherein the method is for the characterisation of the breast cell proliferative disorder in
terms of aggressiveness it is particularly preferred that said nucleotide sequence(s) hybridizes
to a pretreated nucleic acid sequence according to one of SEQ ID NO 64, 65, 186, 187, 68, 
69, 190, 191, 70, 71, 192, 193, 74, 75, 196, 197, 82, 83, 204, 205, 84, 85, 206, 207, 86, 87, 
208, 209, 88, 89, 210, 211, 94, 95, 216, 217, 98, 99, 220, 221, 100, 101, 222, 223, 102, 103, 
224, 225, 110, 111, 232, 233, 112, 113, 234, 235, 118, 119, 240, 241, 130, 131, 252, 253, 134,
135, 256, 257, 150, 151, 272, 273, 152, 153, 274, 275, 166, 167, 288, 289, 170, 171, 292, 293, 
178, 179, 300, 301, 148, 149, 270, 271, 150, 151, 272, 273, 152, 153, 274, 275, 154, 155, 276, 
277, 156, 157, 278, 279, 158, 159, 280, 281, 160, 161, 282, 283, 162, 163, 284, 285, 164, 165, 
286, 287, 166, 167, 288, 289, 168, 169, 290, 291, 170, 171, 292, 293, 172, 173, 294, 295, 174, 
175, 296, 297, 176, 177, 298, 299, 178, 179, 300, 301, 180, 181, 302, 303, 182, 183, 304 and 
305, said contiguous nucleotides comprising at least one CpG, TpG or CpA dinucleotide se- 
quence. 

The fragments obtained by means of the amplification can carry a directly or indirectly de- 
tectable label. Preferred are labels in the form of fluorescence labels, radionuclides, or detach- 
able molecule fragments having a typical mass that can be detected in a mass spectrometer. 
Where said labels are mass labels, it is preferred that the labeled amplificates have a single 
positive or negative net charge, allowing for better detectability in the mass spectrometer. The 
detection may be carried out and visualized by means of, e.g., matrix assisted laser
desorption/ionization mass spectrometry (MALDI) or using electrospray mass spectrometry (ESI).

Matrix Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-TOF) is a very
efficient development for the analysis of biomolecules (Karas and Hillenkamp, Anal Chem.,
60:2299-301, 1988). An analyte is embedded in a light-absorbing matrix. The matrix is
evaporated by a short laser pulse thus transporting the analyte molecule into the vapour phase
in an unfragmented manner. The analyte is ionized by collisions with matrix molecules. An
applied voltage accelerates the ions into a field-free flight tube. Due to their different masses, 
the ions are accelerated at different rates. Smaller ions reach the detector sooner than bigger 
ones. MALDI-TOF spectrometry is well suited to the analysis of peptides and proteins. The 
analysis of nucleic acids is somewhat more difficult (Gut and Beck, Current Innovations and 
Future Trends, 1:147-57, 1995). The sensitivity with respect to nucleic acid analysis is ap- 
proximately 100-times less than for peptides, and decreases disproportionally with increasing 
fragment size. Moreover, for nucleic acids having a multiply negatively charged backbone, 
the ionization process via the matrix is considerably less efficient. In MALDI-TOF spec- 
trometry, the selection of the matrix plays an eminently important role. For desorption of 
peptides, several very efficient matrixes have been found which produce a very fine crystalli- 
sation. There are now several responsive matrixes for DNA, however, the difference in sensi- 
tivity between peptides and nucleic acids has not been reduced. This difference in sensitivity 
can be reduced, however, by chemically modifying the DNA in such a manner that it becomes 
more similar to a peptide. For example, phosphorothioate nucleic acids, in which the usual 
phosphates of the backbone are substituted with thiophosphates, can be converted into a 
charge-neutral DNA using simple alkylation chemistry (Gut and Beck, Nucleic Acids Res. 23: 
1367-73, 1995). The coupling of a charge tag to this modified DNA results in an increase in 
MALDI-TOF sensitivity to the same level as that found for peptides. A further advantage of 
charge tagging is the increased stability of the analysis against impurities, which makes the 
detection of unmodified substrates considerably more difficult. 
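
The time-of-flight separation described above, in which smaller ions reach the detector sooner than bigger ones, follows from equating the electrical energy gained in the accelerating field with the kinetic energy of the ion. The sketch below uses this idealised relation only; the accelerating voltage and flight tube length are assumed example values and are not taken from the present description.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27  # kilograms


def flight_time(mass_da, charge=1, voltage=20000.0, tube_length=1.0):
    """Idealised flight time in seconds through a field-free tube.

    An ion of charge z accelerated through a voltage U gains the energy
    z*e*U = (1/2)*m*v**2 and then drifts the tube length at velocity v.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return tube_length / velocity


# A four-fold heavier ion with the same charge needs twice the flight time.
light_ion_time = flight_time(2000.0)
heavy_ion_time = flight_time(8000.0)
```

Because the flight time scales with the square root of the mass-to-charge ratio, a single net charge per amplificate, as preferred for mass labels, gives one unambiguous signal per fragment mass.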

In the fourth step of the method, the amplificates obtained during the third step of the method 
are analysed in order to ascertain the methylation status of the CpG dinucleotides prior to the 
treatment. 

In embodiments where the amplificates were obtained by means of MSP amplification, the 
presence or absence of an amplificate is in itself indicative of the methylation state of the CpG 
positions covered by the primer, according to the base sequences of said primer. 

Amplificates obtained by means of both standard and methylation specific PCR may be
further analyzed by means of hybridization-based methods such as, but not limited to, array
technology and probe based technologies as well as by means of techniques such as
sequencing and template directed extension.

In one embodiment of the method, the amplificates synthesised in step three are subsequently 
hybridized to an array or a set of oligonucleotides and/or PNA probes. In this context, the 
hybridization takes place in the following manner: the set of probes used during the hybridi- 
zation is preferably composed of at least 2 oligonucleotides or PNA-oligomers; in the process, 
the amplificates serve as probes which hybridize to oligonucleotides previously bonded to a 
solid phase; the non-hybridized fragments are subsequently removed; said oligonucleotides 
contain at least one base sequence having a length of at least 9 nucleotides which is reverse 
complementary or identical to a segment of the base sequences specified in the present
Sequence Listing; and the segment comprises at least one CpG, TpG or CpA dinucleotide.

In a preferred embodiment, said dinucleotide is present in the central third of the oligomer. 
For example, wherein the oligomer comprises one CpG dinucleotide, said dinucleotide is
preferably the fifth to ninth nucleotide from the 5'-end of a 13-mer. One oligonucleotide
exists for the analysis of each CpG dinucleotide within the sequence according to SEQ ID NO 1
to 61, and the equivalent positions within SEQ ID NO 206-449 (according to Table 1). Said
oligonucleotides may also be present in the form of peptide nucleic acids. The non-hybridized
amplificates are then removed. The hybridized amplificates are then detected. In this context,
it is preferred that labels attached to the amplificates are identifiable at each position of the 
solid phase at which an oligonucleotide sequence is located. 
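
By way of illustration only, the preferred placement of the CpG dinucleotide can be checked with a short sketch. The function names and the two example 13-mers below are hypothetical and are not taken from the Sequence Listing; the boundaries of the "central third" are an assumption chosen so that a 13-mer yields the fifth to ninth positions stated above, and the check is applied to the cytosine of the CpG.

```python
def central_third(length):
    """Return the 1-based (first, last) positions of the central third of an oligomer."""
    first = length // 3 + 1
    last = (2 * length + 2) // 3  # integer ceiling of 2 * length / 3
    return first, last


def cpg_in_central_third(oligo):
    """Return True if the cytosine of at least one CpG lies in the central third."""
    seq = oligo.upper()
    first, last = central_third(len(seq))
    for i in range(len(seq) - 1):
        # i is 0-based, so the cytosine sits at 1-based position i + 1
        if seq[i:i + 2] == "CG" and first <= i + 1 <= last:
            return True
    return False


print(central_third(13))                      # (5, 9)
print(cpg_in_central_third("ATTAGTCGATTAG"))  # True: C of the CpG is nucleotide 7
print(cpg_in_central_third("CGATTAGTTAGTA"))  # False: C of the CpG is nucleotide 1
```

The same helper gives positions 4 to 6 for a 9-mer, so it can be reused for shorter PNA-oligomers under the same assumption.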

In yet a further embodiment of the method, the genomic methylation status of the CpG posi- 
tions may be ascertained by means of oligonucleotide probes that are hybridised to the bisul- 
fite treated DNA concurrently with the PCR amplification primers (wherein said primers may 
either be methylation specific or standard). 

A particularly preferred embodiment of this method is the use of fluorescence-based Real 
Time Quantitative PCR (Heid et al., Genome Res. 6:986-994, 1996; also see United States 
Patent No. 6,331,393) employing a dual-labeled fluorescent oligonucleotide probe 
(TaqMan™ PCR, using an ABI Prism 7700 Sequence Detection System, Perkin Elmer Applied
Biosystems, Foster City, California). The TaqMan™ PCR reaction employs a
nonextendible interrogating oligonucleotide, called a TaqMan™ probe, which, in preferred
embodiments, is designed to hybridize to a GpC-rich sequence located between the forward
and reverse amplification primers. The TaqMan™ probe further comprises a fluorescent "reporter
moiety" and a "quencher moiety" covalently bound to linker moieties (e.g., phosphoramidites)
attached to the nucleotides of the TaqMan™ oligonucleotide. For analysis of methylation
within nucleic acids subsequent to bisulfite treatment, it is required that the probe be
methylation specific, as described in United States Patent No. 6,331,393 (hereby incorporated
by reference in its entirety), also known as the MethyLight™ assay. Variations on the
TaqMan™ detection methodology that are also suitable for use with the described invention
include the use of dual-probe technology (LightCycler™) or fluorescent amplification primers
(Sunrise™ technology). Both these techniques may be adapted in a manner suitable for use
with bisulfite treated DNA, and moreover for methylation analysis within CpG dinucleotides.

A further suitable method for the use of probe oligonucleotides for the assessment of meth- 
ylation by analysis of bisulfite treated nucleic acids In a further preferred embodiment of the 
method, the fifth step of the method comprises the use of template-directed oligonucleotide 
extension, such as MS-SNuPE as described by Gonzalgo and Jones, Nucleic Acids Res. 
25:2529-2531, 1997. 

In yet a further embodiment of the method, the fifth step of the method comprises sequencing 
and subsequent sequence analysis of the amplificate generated in the third step of the method 
(Sanger F., et al, Proc Natl Acad Sci USA 74:5463-5467, 1977). 

In one preferred embodiment of the method the nucleic acids according to SEQ ID NO 1 to 
61, are isolated and treated according to the first three steps of the method outlined above, 
namely: 

a) obtaining, from a subject, a biological sample having subject genomic DNA; 

b) extracting or otherwise isolating the genomic DNA; and 

c) treating the genomic DNA of b), or a fragment thereof, with one or more reagents to con- 
vert cytosine bases that are unmethylated in the 5-position thereof to uracil or to another base 
that is detectably dissimilar to cytosine in terms of hybridization properties; 

and wherein the subsequent amplification of d) is carried out in a methylation specific manner,
namely by use of methylation specific primers or blocking oligonucleotides, and further
wherein the detection of the amplificates is carried out by means of real-time detection
probes, as described above.
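
The effect of the treatment in step c) can be sketched in silico as follows. This is a minimal illustration, not a description of the wet-lab procedure: the function name, the short example sequence and the way methylated positions are supplied (as 0-based indices) are all assumptions made for the example.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Convert every unmethylated cytosine to uracil.

    Cytosines whose 0-based index is in methylated_positions are treated as
    5-methylcytosine and are left unchanged.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


genomic = "ACGTCCGA"  # hypothetical fragment, cytosines at indices 1, 4 and 5
print(bisulfite_convert(genomic))                            # AUGTUUGA
print(bisulfite_convert(genomic, methylated_positions={1}))  # ACGTUUGA
```

The two outputs differ only at the methylated position, which is the difference that the subsequent methylation specific amplification and detection steps exploit.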

Wherein the subsequent amplification of d) is carried out by means of methylation specific 
primers, as described above, said methylation specific primers comprise a sequence having a 
length of at least 9 nucleotides which hybridizes to a pretreated nucleic acid sequence ac- 
cording to one of SEQ ID NO 206-449, and sequences complementary thereto, wherein the 
base sequence of said oligomers comprises at least one CpG dinucleotide. 
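
The requirements placed on a methylation specific primer above (a length of at least 9 nucleotides, at least one CpG dinucleotide, and hybridization to the pretreated sequence) can be illustrated with the following sketch. It makes several simplifying assumptions that are not part of the source: hybridization is modelled as an exact match of a sense-strand primer, uracil is written as T as it would appear after amplification, all CpG positions of a template share one methylation state, and the sequences and function names are hypothetical.

```python
def convert(seq, cpg_methylated):
    """In-silico pretreatment: C outside CpG -> T; C within CpG kept only if methylated."""
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and seq[i + 1:i + 2] == "G"
        if base == "C" and not (in_cpg and cpg_methylated):
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def is_msp_primer(primer, min_len=9):
    """Check the formal requirements: minimum length and at least one CpG."""
    p = primer.upper()
    return len(p) >= min_len and "CG" in p


def amplifies(primer, pretreated):
    """Crude stand-in for hybridization: exact match on the pretreated strand."""
    return primer.upper() in pretreated


genomic = "GATCCGTACGTTCAGG"           # hypothetical genomic fragment
methylated = convert(genomic, True)    # GATTCGTACGTTTAGG
unmethylated = convert(genomic, False) # GATTTGTATGTTTAGG
primer = "TTCGTACGTT"                  # hypothetical primer covering two CpG positions

print(is_msp_primer(primer))            # True
print(amplifies(primer, methylated))    # True
print(amplifies(primer, unmethylated))  # False
```

Under these assumptions the primer matches only the template derived from methylated DNA, so the presence of an amplificate reports the methylation state of the covered CpG positions.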

Wherein the method is for the prediction of disease free survival and/or probability of response
to a treatment which targets the estrogen receptor pathway or is involved in estrogen
metabolism, production or secretion, it is particularly preferred that said blocking
oligonucleotide sequence(s) hybridize to a pretreated nucleic acid sequence according to one
of SEQ ID NO 70, 71, 192, 193, 72, 73, 194, 195, 84, 85, 206, 207, 94, 95, 216, 217,
96, 97, 218, 219, 100, 101, 222, 223, 106, 107, 228, 229, 116, 117, 238, 239, 92, 93, 214, 215,
122, 123, 244, 245, 126, 127, 248, 249, 130, 131, 252, 253, 132, 133, 254, 255, 134, 135, 256, 
257, 146, 147, 268, 269, 148, 149, 270, 271, 152, 153, 274, 275, 154, 155, 276, 277, 158, 159, 
280, 281, 162, 163, 284 and 285, said contiguous nucleotides comprising at least one CpG, 
TpG or CpA dinucleotide sequence. 

Wherein the method is for the characterisation of the breast cell proliferative disorder in terms
of aggressiveness, it is particularly preferred that said blocking oligonucleotide
sequence(s) hybridize to a pretreated nucleic acid sequence according to one of SEQ ID NO
64, 65, 186, 187, 68, 69, 190, 191, 70, 71, 192, 193, 74, 75, 196, 197, 82, 83, 204, 205, 84, 85,
206, 207, 86, 87, 208, 209, 88, 89, 210, 211, 94, 95, 216, 217, 98, 99, 220, 221, 100, 101, 222,
223, 102, 103, 224, 225, 110, 111, 232, 233, 112, 113, 234, 235, 118, 119, 240, 241, 130, 131,
252, 253, 134, 135, 256, 257, 150, 151, 272, 273, 152, 153, 274, 275, 166, 167, 288, 289, 170, 
171, 292, 293, 178, 179, 300, 301, 148, 149, 270, 271, 150, 151, 272, 273, 152, 153, 274, 275, 
154, 155, 276, 277, 156, 157, 278, 279, 158, 159, 280, 281, 160, 161, 282, 283, 162, 163, 284, 
285, 164, 165, 286, 287, 166, 167, 288, 289, 168, 169, 290, 291, 170, 171, 292, 293, 172, 173, 
294, 295, 174, 175, 296, 297, 176, 177, 298, 299, 178, 179, 300, 301, 180, 181, 302, 303, 182, 
183, 304 and 305, said contiguous nucleotides comprising at least one CpG, TpG or CpA
dinucleotide sequence.

Step e) of the method, namely the detection of the specific amplificates indicative of the
methylation status of one or more CpG positions according to SEQ ID NO 1 to 61 is carried
out by means of real-time detection methods as described above.

In an alternative most preferred embodiment of the method the subsequent amplification of d) 
is carried out in the presence of blocking oligonucleotides, as described above. Said blocking
oligonucleotides comprise a sequence having a length of at least 9 nucleotides which
hybridizes to a pretreated nucleic acid sequence according to one of SEQ ID NO 206-449 and
sequences complementary thereto, wherein the base sequence of said oligomers comprises at 
least one CpG, TpG or CpA dinucleotide. Step e) of the method, namely the detection of the 
specific amplificates indicative of the methylation status of one or more CpG positions ac- 
cording to SEQ ID NO 206-449 is carried out by means of real-time detection methods as 
described above. 

In a further preferred embodiment of the method the nucleic acids according to SEQ ID NO 1
to 61 are isolated and treated according to the first three steps of the method outlined above,
namely: 

a) obtaining, from a subject, a biological sample having subject genomic DNA; 

b) extracting or otherwise isolating the genomic DNA; 

c) treating the genomic DNA of b), or a fragment thereof, with one or more reagents to convert
cytosine bases that are unmethylated in the 5-position thereof to uracil or to another base
that is detectably dissimilar to cytosine in terms of hybridization properties; and wherein 

d) amplifying subsequent to treatment in c) is carried out in a methylation specific manner, 
namely by use of methylation specific primers or blocking oligonucleotides, and further 
wherein 

e) detecting of the amplificates is carried out by means of real-time detection probes, as
described above.

Wherein the subsequent amplification of d) is carried out by means of methylation specific
primers, as described above, said methylation specific primers comprise a sequence having a
length of at least 9 nucleotides which hybridizes to a pretreated nucleic acid sequence
according to one of SEQ ID NO 206-449 and sequences complementary thereto, wherein the
base sequence of said oligomers comprises at least one CpG dinucleotide. Wherein the
method is for the prediction of disease free survival and/or probability of response to a
treatment which targets the estrogen receptor pathway or is involved in estrogen metabolism,
production or secretion, it is particularly preferred that said methylation specific primers
hybridize to a pretreated nucleic acid sequence according to one of SEQ ID NO 70, 71,
192, 193, 72, 73, 194, 195, 84, 85, 206, 207, 94, 95, 216, 217, 96, 97, 218, 219, 100, 101, 222, 
223, 106, 107, 228, 229, 116, 117, 238, 239, 92, 93, 214, 215, 122, 123, 244, 245, 126, 127, 
248, 249, 130, 131, 252, 253, 132, 133, 254, 255, 134, 135, 256, 257, 146, 147, 268, 269, 148, 
149, 270, 271, 152, 153, 274, 275, 154, 155, 276, 277, 158, 159, 280, 281, 162, 163, 284 and 
285, said contiguous nucleotides comprising at least one CpG, TpG or CpA dinucleotide se- 
quence. 

Wherein the method is for the characterisation of the breast cell proliferative disorder in terms
of aggressiveness, it is particularly preferred that said methylation specific primers hybridize to
a pretreated nucleic acid sequence according to one of SEQ ID NO 64, 65, 186, 187, 68, 69,
190, 191, 70, 71, 192, 193, 74, 75, 196, 197, 82, 83, 204, 205, 84, 85, 206, 207, 86, 87, 208,
209, 88, 89, 210, 211, 94, 95, 216, 217, 98, 99, 220, 221, 100, 101, 222, 223, 102, 103, 224,
225, 110, 111, 232, 233, 112, 113, 234, 235, 118, 119, 240, 241, 130, 131, 252, 253, 134, 135,
256, 257, 150, 151, 272, 273, 152, 153, 274, 275, 166, 167, 288, 289, 170, 171, 292, 293, 178, 
179, 300, 301, 148, 149, 270, 271, 150, 151, 272, 273, 152, 153, 274, 275, 154, 155, 276, 277, 
156, 157, 278, 279, 158, 159, 280, 281, 160, 161, 282, 283, 162, 163, 284, 285, 164, 165, 286, 
287, 166, 167, 288, 289, 168, 169, 290, 291, 170, 171, 292, 293, 172, 173, 294, 295, 174, 175, 
296, 297, 176, 177, 298, 299, 178, 179, 300, 301, 180, 181, 302, 303, 182, 183, 304 and 305, 
said contiguous nucleotides comprising at least one CpG, TpG or CpA dinucleotide sequence. 

Additional embodiments of the invention provide a method for the analysis of the methylation
status of genomic DNA according to the invention (SEQ ID NO 1 to 61, and complements
thereof) without the need for pretreatment.

Wherein the method is for the prediction of disease free survival and/or probability of response
to a treatment which targets the estrogen receptor pathway or is involved in estrogen
metabolism, production or secretion, it is particularly preferred that said genomic sequences
are selected from SEQ ID NO 5, 6, 12, 17, 18, 20, 23, 28, 16, 31, 33, 35, 36, 37, 43, 44, 46, 
47, 49 and 51. 



WO 2005/059172 



PCT/EP2004/014170 



Wherein the method is for the characterisation of the breast cell proliferative disorder in terms
of aggressiveness, it is particularly preferred that said genomic sequences are selected from
SEQ ID NO 2, 4, 5, 7, 11, 12, 13, 14, 17, 19, 20, 21, 25, 26, 29, 35, 37, 45, 46, 53, 55 and 59. 

In the first step of such additional embodiments, the genomic DNA sample is isolated from
tissue or cellular sources. Preferably, such sources include cell lines, histological slides,
body fluids, or tissue embedded in paraffin. In the second step, the genomic DNA is
extracted. Extraction may be by means that are standard to one skilled in the
art, including but not limited to the use of detergent lysates, sonication and vortexing with
glass beads. Once the nucleic acids have been extracted, the genomic double-stranded DNA is
used in the analysis.

In a preferred embodiment, the DNA may be cleaved prior to the treatment, and this may be 
by any means standard in the state of the art, in particular with methylation-sensitive restric- 
tion endonucleases. 

In the third step, the DNA is then digested with one or more methylation sensitive restriction 
enzymes. The digestion is carried out such that hydrolysis of the DNA at the restriction site is 
informative of the methylation status of a specific CpG dinucleotide. 
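
How a methylation sensitive digestion reports on a CpG position can be sketched as follows. The source does not name an enzyme; HpaII (recognition site CCGG, cleavage blocked when the internal cytosine is methylated) is used here purely as an assumed example, and the sequence, function name and cut offset are illustrative.

```python
def digest(seq, methylated_positions=frozenset(), site="CCGG", cut_offset=1):
    """Cut at every recognition site whose internal cytosine is unmethylated.

    methylated_positions holds 0-based indices of methylated cytosines.
    The internal cytosine of a site starting at index p is at index p + 1.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        if pos + 1 not in methylated_positions:
            cut = pos + cut_offset          # cleavage after the first C (C^CGG)
            fragments.append(seq[start:cut])
            start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


dna = "AATTCCGGTTAACCGGAT"  # hypothetical fragment with sites at indices 4 and 12
print(digest(dna))                            # ['AATTC', 'CGGTTAAC', 'CGGAT']
print(digest(dna, methylated_positions={5}))  # ['AATTCCGGTTAAC', 'CGGAT']
```

In this sketch a site that remains uncut indicates a methylated CpG, so a fragment spanning the site survives digestion and can subsequently be amplified and detected.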

In the fourth step, which is optional but a preferred embodiment, the restriction fragments are 
amplified. This is preferably carried out using a polymerase chain reaction, and said amplifi- 
cates may carry suitable detectable labels as discussed above, namely fluorophore labels, ra- 
dionucleotides and mass labels. 

In the fifth step the amplificates are detected. The detection may be by any means standard in 
the art, for example, but not limited to, gel electrophoresis analysis, hybridization analysis, 
incorporation of detectable tags within the PCR products, DNA array analysis, MALDI or
ESI analysis. 



Once the methylation status of the selected CpG positions has been ascertained, parameters
relevant to patient treatment can be determined, wherein hypermethylation of the genes is
associated with poor prognosis of said subject, aggressive characteristics of said cell
proliferative disorder, poor disease free survival and/or lower probability of response of said
subject to said treatment, relative to individuals with hypomethylation.

The term "hypermethylation" refers to the average methylation state corresponding to an in- 
creased (above average or median) presence of 5-mCyt at one or a plurality of CpG dinucleo- 
tides within a DNA sequence of a test DNA sample, relative to the amount of 5-mCyt found 
at corresponding CpG dinucleotides within a control DNA sample. 

The term "hypomethylation" refers to the average methylation state corresponding to a de- 
creased (below average or median) presence of 5-mCyt at one or a plurality of CpG dinu- 
cleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mCyt 
found at corresponding CpG dinucleotides within a control DNA sample. 
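
The two definitions above can be expressed as a simple comparison of average methylation levels at corresponding CpG positions. The following sketch is illustrative only: the function name, the representation of methylation as fractions between 0 and 1, and the tolerance parameter are assumptions and are not specified in the source.

```python
from statistics import mean


def classify_methylation(test_levels, control_levels, tolerance=0.05):
    """Compare average 5-mCyt levels of a test sample with a control sample.

    Both arguments list the methylation fraction (0 to 1) at the same CpG
    positions, in the same order. tolerance is a hypothetical margin within
    which the two samples are regarded as similar.
    """
    if len(test_levels) != len(control_levels):
        raise ValueError("levels must refer to corresponding CpG positions")
    difference = mean(test_levels) - mean(control_levels)
    if difference > tolerance:
        return "hypermethylated"
    if difference < -tolerance:
        return "hypomethylated"
    return "similar"


print(classify_methylation([0.8, 0.9, 0.7], [0.2, 0.3, 0.1]))  # hypermethylated
print(classify_methylation([0.1, 0.2], [0.6, 0.7]))            # hypomethylated
print(classify_methylation([0.5, 0.5], [0.5, 0.5]))            # similar
```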

Kits 

Moreover, an additional aspect of the present invention is a kit comprising, for example: a 
bisulfite-containing reagent; a set of primer oligonucleotides containing at least two oligonu- 
cleotides whose sequences in each case correspond, are complementary, or hybridize under 
stringent or highly stringent conditions to a 16-base long segment of the sequences SEQ ID 
NO: 1 to 61 and 206-449; oligonucleotides and/or PNA-oligomers; as well as instructions for 
carrying out and evaluating the described method. In a further preferred embodiment, said kit 
may further comprise standard reagents for performing a CpG position-specific methylation 
analysis, wherein said analysis comprises one or more of the following techniques: MS-SNuPE,
MSP, MethyLight™, HeavyMethyl™, COBRA, and nucleic acid sequencing. However, a kit
along the lines of the present invention can also contain only part of the aforementioned
components.

Typical reagents (e.g., as might be found in a typical MethyLight-based kit) for MethyLight
analysis may include, but are not limited to: PCR primers for specific gene (or
methylation-altered DNA sequence or CpG island); TaqMan® probes; optimised PCR buffers and
deoxynucleotides; and Taq polymerase.

Typical reagents (e.g., as might be found in a typical Ms-SNuPE-based kit) for Ms-SNuPE
analysis may include, but are not limited to: PCR primers for specific gene (or
methylation-altered DNA sequence or CpG island); optimised PCR buffers and deoxynucleotides; gel
extraction kit; positive control primers; Ms-SNuPE primers for specific gene; reaction buffer
(for the Ms-SNuPE reaction); and radioactive nucleotides. Additionally, bisulfite conversion
reagents may include: DNA denaturation buffer; sulfonation buffer; DNA recovery reagents or
kit (e.g., precipitation, ultrafiltration, affinity column); desulfonation buffer; and DNA
recovery components.

Typical reagents (e.g., as might be found in a typical MSP-based kit) for MSP analysis may 
include, but are not limited to: methylated and unmethylated PCR primers for specific gene
(or methylation-altered DNA sequence or CpG island), optimized PCR buffers and deoxynu- 
cleotides, and specific probes. 

In order to enable the disclosed method, the invention further provides the modified DNA of 
one or a combination of genes taken from the group EGR4, APC, CDKN2A, CSPG2, 
ERBB2, STMN1, STK11, CA9, PAX6, SFN, S100A2, TFF1, TGFBR2, TP53, TP73, PLAU, 
TMEFF2, ESR1, SYK, HSPB1, RASSF1, TES, GRIN2D, PSAT1, CGA, CYP2D6, 
COX7A2L, ESR2, PLAU, VTN, SULT1A1, PCAF, PRKCD, ONECUT2, BCL6, WBP11, 
(MX1)MX1, APP, ORC4L, NETO1, TBC1D3, GRB7, CDK6, SEQ ID NO: 47, SEQ ID NO:
48, ABCA8, SEQ ID NO: 50, SEQ ID NO: 51, MARK2, ELK1, Q8WUT3, CGB, BSG, 
BCKDK, SOX8, DAG1, SEMA4B and ESR1 (exon8) as well as oligonucleotides and/or 
PNA-oligomers for detecting cytosine methylations within said genes. The present invention 
is based on the discovery that genetic and epigenetic parameters and, in particular, the cyto- 
sine methylation patterns of said genomic DNAs are particularly suitable for improved treat- 
ment and monitoring of breast cell proliferative disorders. 

The nucleic acids according to the present invention can be used for the analysis of genetic 
and/or epigenetic parameters of genomic DNA. 

This objective according to the present invention is achieved using a nucleic acid containing a 
sequence of at least 16 bases in length of the pretreated genomic DNA according to one of 
SEQ ID NO: 206 to SEQ ID NO: 449 and sequences complementary thereto. 

The modified nucleic acids could heretofore not be connected with the improved treatment of
breast cell proliferative disorders by prediction of disease free survival and/or probability of
response to treatment and/or characterisation of the disease in terms of aggressiveness.

The object of the present invention is further achieved by an oligonucleotide or oligomer for
the analysis of pretreated DNA, for detecting the genomic cytosine methylation state, said
oligonucleotide containing at least one base sequence having a length of at least 10
nucleotides which hybridises to a pretreated genomic DNA according to SEQ ID NO: 206 to SEQ
ID NO: 449. The oligomer probes according to the present invention constitute important and
effective tools which, for the first time, make it possible to ascertain specific genetic and
epigenetic parameters during the analysis of biological samples for features associated with a
patient's disease free survival and/or response to endocrine treatment. Said oligonucleotides
allow the improved treatment and monitoring of breast cell proliferative disorders. The base
sequence of the oligomers preferably contains at least one CpG or TpG dinucleotide. The
probes may also exist in the form of a PNA (peptide nucleic acid) which has particularly
preferred pairing properties. Particularly preferred are oligonucleotides according to the present
invention in which the cytosine of the CpG dinucleotide is within the middle third of said
oligonucleotide, e.g. the 5th to 9th nucleotide from the 5'-end of a 13-mer oligonucleotide; or in
the case of PNA-oligomers, it is preferred for the cytosine of the CpG dinucleotide to be the
4th to 6th nucleotide from the 5'-end of the 9-mer.

The oligomers according to the present invention are normally used in so called "sets" which 
contain up to two oligomers and up to one oligomer for each of the CpG dinucleotides within
SEQ ID NO: 206 to SEQ ID NO: 449.

In the case of the sets of oligonucleotides according to the present invention, it is preferred 
that at least one oligonucleotide is bound to a solid phase. It is further preferred that all the 
oligonucleotides of one set are bound to a solid phase. 

The present invention further relates to a set of at least two oligomers (oligonucleotides and/or
PNA-oligomers) used for detecting the cytosine methylation state of genomic DNA, by analysis of
said sequence or treated versions of said sequence (of the genes EGR4, APC, CDKN2A,
CSPG2, ERBB2, STMN1, STK11, CA9, PAX6, SFN, S100A2, TFF1, TGFBR2, TP53,
TP73, PLAU, TMEFF2, ESR1, SYK, HSPB1, RASSF1, TES, GRIN2D, PSAT1, CGA,
CYP2D6, COX7A2L, ESR2, PLAU, VTN, SULT1A1, PCAF, PRKCD, ONECUT2, BCL6,
WBP11, (MX1)MX1, APP, ORC4L, NETO1, TBC1D3, GRB7, CDK6, SEQ ID NO: 47,
SEQ ID NO: 48, ABCA8, SEQ ID NO: 50, SEQ ID NO: 51, MARK2, ELK1, Q8WUT3,
CGB, BSG, BCKDK, SOX8, DAG1, SEMA4B and ESR1 (exon8), as detailed in the Sequence
Listing and Table 1, and sequences complementary thereto). These probes enable improved
treatment and monitoring of breast cell proliferative disorders.

It will be obvious to one skilled in the art that the method according to the invention will be 
improved and supplemented by the incorporation of markers and clinical indicators known in 
the state of the art and currently used as predictive of the outcome of therapies which target 
endocrine or endocrine associated pathways. More preferably said markers include node 
status, age, menopausal status, grade, estrogen and progesterone receptors. 

The genes that form the basis of the present invention may be used to form a "gene panel", i.e.
a collection comprising the particular genetic sequences of the present invention and/or their
respective informative methylation sites. The formation of gene panels allows for a quick and
specific analysis of specific aspects of breast cancer treatment. The gene panel(s) as described
and employed in this invention can be used with surprisingly high efficiency for the treatment
of breast cell proliferative disorders by prediction of the outcome of treatment with a therapy
comprising one or more drugs which target the estrogen receptor pathway or are involved in
estrogen metabolism, production, or secretion. The analysis of each gene of the panel
contributes to the evaluation of patient responsiveness; however, in a less preferred embodiment
the patient evaluation may be achieved by analysis of only a single gene. The analysis of a single
member of the "gene panel" would enable a cheap but less accurate means of evaluating
patient responsiveness, whereas the analysis of multiple members of the panel would provide a
rather more expensive means of carrying out the method, but with a higher accuracy (the
technically preferred solution).

The efficiency of the method according to the invention is improved when applied to patients 
who have not been treated with chemotherapy. Accordingly, it is a particularly preferred em- 
bodiment of the method wherein the method is used for the assessment of subjects who have 
not undergone chemotherapy. 

According to the present invention, it is preferred that an arrangement of different
oligonucleotides and/or PNA-oligomers (a so-called "array") made available by the present
invention is likewise bound to a solid phase. This array of different oligonucleotide
and/or PNA-oligomer sequences can be characterised in that it is arranged on the
solid phase in the form of a rectangular or hexagonal lattice. The solid phase surface is
preferably composed of silicon, glass, polystyrene, aluminium, steel, iron, copper, nickel, silver,
or gold. However, nitrocellulose as well as plastics such as nylon, which can exist in the form
of pellets or also as resin matrices, are suitable alternatives.

Therefore, a further subject matter of the present invention is a method for manufacturing an 
array fixed to a carrier material for the improved treatment and monitoring of breast cell pro- 
liferative disorders. In said method at least one oligomer according to the present invention is 
coupled to a solid phase. Methods for manufacturing such arrays are known, for example, 
from US Patent 5,744,305 by means of solid-phase chemistry and photolabile protecting 
groups. 

A further subject matter of the present invention relates to a DNA chip for the improved 
treatment and monitoring of breast cell proliferative disorders. The DNA chip contains at least 
one nucleic acid according to the present invention. DNA chips are known, for example, in 
US Patent 5,837,832. 

The oligomers according to the present invention or arrays thereof as well as a kit according 
to the present invention are intended to be used for the improved treatment and monitoring of 
breast cell proliferative disorders. According to the present invention, the method is prefera- 
bly used for the analysis of important genetic and/or epigenetic parameters within genomic 
DNA, in particular for use in improved treatment and monitoring of breast cell proliferative 
disorders. 

The methods according to the present invention are used for improved treatment and
monitoring of breast cell proliferative disorders by enabling more informed therapeutic regimens.

The present invention moreover relates to the diagnosis and/or prognosis of events which are
disadvantageous or relevant to patients or individuals, in which important genetic and/or
epigenetic parameters within genomic DNA, said parameters being obtained by means of the
present invention, may be compared to another set of genetic and/or epigenetic parameters, the
differences serving as the basis for the diagnosis and/or prognosis of events which are
disadvantageous or relevant to patients or individuals.




In the context of the present invention the term "hybridisation" is to be understood as a bond 
of an oligonucleotide to a completely complementary sequence along the lines of the Watson- 
Crick base pairings in the sample DNA, forming a duplex structure. 

In the context of the present invention, "genetic parameters" are mutations and polymor- 
phisms of genomic DNA and sequences further required for their regulation. To be designated 
as mutations are, in particular, insertions, deletions, point mutations, inversions and polymor- 
phisms and, particularly preferred, SNPs (single nucleotide polymorphisms). 

In the context of the present invention the term "methylation state" is taken to mean the
degree of methylation present in a nucleic acid of interest; this may be expressed in absolute or
relative terms, i.e. as a percentage or other numerical value, or by comparison to another tissue
and therein described as hypermethylated, hypomethylated or as having significantly similar
or identical methylation status.

In the context of the present invention the term "regulatory region" of a gene is taken to mean 
nucleotide sequences which affect the expression of a gene. Said regulatory regions may be 
located within, proximal or distal to said gene. Said regulatory regions include but are not 
limited to constitutive promoters, tissue-specific promoters, developmental-specific promot- 
ers, inducible promoters and the like. Promoter regulatory elements may also include certain 
enhancer sequence elements that control transcriptional or translational efficiency of the gene. 

In the context of the present invention the term "chemotherapy" is taken to mean the use of 
drugs or chemical substances to treat cancer. This definition excludes radiation therapy 
(treatment with high energy rays or particles), hormone therapy (treatment with hormones or
hormone analogues (synthetic substitutes)) and surgical treatment.

In the context of the present invention, "epigenetic parameters" are, in particular, cytosine
methylations and further modifications of DNA bases of genomic DNA and sequences further
required for their regulation. Further epigenetic parameters include, for example, the
acetylation of histones which cannot be directly analysed using the described method but which, in
turn, correlates with the DNA methylation.

In the context of the present invention the term "adjuvant treatment" is taken to mean a
therapy of a cancer patient immediately following an initial non chemotherapeutical therapy, e.g.
surgery. In general, the purpose of an adjuvant therapy is to provide a significantly smaller
risk of recurrences compared to treatment without the adjuvant therapy.

In the context of the present invention the terms "estrogen receptor positive" and/or
"progesterone receptor positive" when used to describe a breast cell proliferative disorder are taken to
mean that the proliferating cells express said hormone receptor.

BEST MODE 

Characterization of a breast cancer in terms of prognosis and/or treatment outcome enables 
the physician to make an informed decision as to a therapeutic regimen with appropriate risk
and benefit trade-offs to the patient.

In the context of the present mode of the invention the terms "estrogen receptor positive" 
and/or "progesterone receptor positive" when used to describe a breast cell proliferative dis- 
order are taken to mean that the proliferating cells express said hormone receptor. 

In the context of the present mode of the invention the term 'aggressiveness' is taken to mean 
one or more of high likelihood of relapse post surgery; below average or below median pa- 
tient survival; below average or below median disease free survival; below average or below 
median relapse-free survival; above average tumor-related complications; fast progression of 
tumor or metastases. According to the aggressiveness of the disease an appropriate treatment 
or treatments may be selected from the group consisting of chemotherapy, radiotherapy, sur- 
gery, biological therapy, immunotherapy, antibody treatments, treatments involving molecu- 
larly targeted drugs, estrogen receptor modulator treatments, estrogen receptor down-regulator 
treatments, aromatase inhibitor treatments, ovarian ablation, treatments providing LHRH
analogues or other centrally acting drugs influencing estrogen production. Wherein a cancer is 
characterized as 'aggressive' it is particularly preferred that a treatment such as, but not lim- 
ited to, chemotherapy is provided in addition to or instead of an endocrine targeting therapy. 
Indicators of tumor aggressiveness standard in the art include but are not limited to, tumor 
stage, tumor grade, nodal status and survival. 

Unless stated otherwise as used herein the term "survival" shall be taken to include all of the 
following: survival until mortality, also known as overall survival (wherein said mortality 
may be either irrespective of cause or breast tumor related); "recurrence-free survival" 
(wherein the term recurrence shall include both localized and distant recurrence); metastasis 
free survival; disease free survival (wherein the term disease shall include breast cancer and 
diseases associated therewith). The length of said survival may be calculated by reference to a 
defined start point (e.g. time of diagnosis or start of treatment) and end point (e.g. death, re- 
currence or metastasis). 
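
By way of a non-limiting illustration, the calculation of a length of survival from a defined start point and end point may be sketched as follows. The function name `survival_days` and the dates used are hypothetical and serve only to show the arithmetic; the choice of days as the unit is likewise an assumption.

```python
from datetime import date


def survival_days(start: date, end: date) -> int:
    """Length of survival in days between a defined start point
    (e.g. time of diagnosis or start of treatment) and a defined
    end point (e.g. death, recurrence or metastasis)."""
    if end < start:
        raise ValueError("end point precedes start point")
    return (end - start).days


# Hypothetical dates, for illustration only.
diagnosis = date(2001, 3, 15)
recurrence = date(2003, 3, 15)
print(survival_days(diagnosis, recurrence))
```

The same function applies to any of the survival types listed above; only the event chosen as the end point changes.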

As used herein the term "prognostic marker" shall be taken to mean an indicator of the likeli- 
hood of progression of the disease, in particular aggressiveness and metastatic potential of a 
breast tumor. 

As used herein the term 'predictive marker' shall be taken to mean an indicator of response to 
therapy, wherein said response is preferably defined according to patient survival. It is preferably used 
to define patients with high, low and intermediate length of survival or recurrence after treatment, 
that is the result of the inherent heterogeneity of the disease process. 

As defined herein the term predictive marker may in some situations fall within the remit of a 
herein described "prognostic marker", for example, wherein a prognostic marker differentiates 
between patients with different survival outcomes pursuant to a treatment, said marker is also 
a predictive marker for said treatment. Therefore, unless otherwise stated the two terms shall 
not be taken to be mutually exclusive. 

As used herein the term 'expression' shall be taken to mean the transcription and translation 
of a gene, as well as the genetic or the epigenetic modifications of the genomic DNA associ- 
ated with the marker gene and/or regulatory or promoter regions thereof. Genetic modifica- 
tions include SNPs, point mutations, deletions, insertions, repeat length, rearrangements and 
other polymorphisms. The analysis of either the expression levels of protein, or mRNA or the 
analysis of the patient's individual genetic or epigenetic modification of the marker gene are 
herein summarized as the analysis of 'expression' of the gene. 

The level of expression of a gene may be determined by the analysis of any factors associated 
with or indicative of the level of transcription and translation of a gene including but not limited 
to methylation analysis, loss of heterozygosity (hereinafter also referred to as LOH), 
RNA expression levels and protein expression levels. 

Furthermore the activity of the transcribed gene may be affected by genetic variations such as 
but not limited to genetic modifications (including but not limited to SNPs, point mutations, 
deletions, insertions, repeat length, rearrangements and other polymorphisms). 

The terms "endocrine therapy" or "endocrine treatment" are meant to comprise any therapy, 
treatment or treatments targeting the estrogen receptor pathway or estrogen synthesis pathway 
or estrogen conversion pathway, which is involved in estrogen metabolism, production or 
secretion. Said treatments include, but are not limited to estrogen receptor modulators, estro- 
gen receptor down-regulators, aromatase inhibitors, ovarian ablation, LHRH analogues and 
other centrally acting drugs influencing estrogen production. 

The term "monotherapy" shall be taken to mean the use of a single drug or other therapy. 

In the context of the present mode of the invention the term "chemotherapy" is taken to mean 
the use of pharmaceutical or chemical substances to treat cancer. This definition excludes 
radiation therapy (treatment with high energy rays or particles), hormone therapy (treatment 
with hormones or hormone analogues) and surgical treatment. 

In the context of the present mode of the invention the term "adjuvant treatment" is taken to 
mean a therapy of a cancer patient immediately following an initial non chemotherapeutical 
therapy, e.g. surgery. In general, the purpose of an adjuvant therapy is to decrease the risk of 
recurrence. 

In the context of the present mode of the invention the term "determining a suitable treatment 
regimen for the subject" is taken to mean the determination of a treatment regimen (i.e. a sin- 
gle therapy or a combination of different therapies that are used for the prevention and/or 
treatment of the cancer in the patient) for a patient that is started, modified and/or ended based 
or essentially based or at least partially based on the results of the analysis according to the 
present mode of the invention. One example is starting an adjuvant endocrine therapy after 
surgery; another would be to modify the dosage of a particular chemotherapy. The determination 
can, in addition to the results of the analysis according to the present mode of the invention, 
be based on personal characteristics of the subject to be treated. In most cases, the actual 
determination of the suitable treatment regimen for the subject will be performed by the at- 
tending physician or doctor. 

In the context of this mode of the invention the terms "obtaining a biological sample" or 
"obtaining a sample from a subject", shall not be taken to include the active retrieval of a 
sample from an individual, e.g. the performance of a biopsy. Said terms shall be taken to 
mean the obtainment of a sample previously isolated from an individual. Said samples may be 
isolated by any means standard in the art, including but not limited to biopsy, surgical re- 
moval, body fluids isolated by means of aspiration. Furthermore said samples may be pro- 
vided by third parties including but not limited to clinicians, couriers, commercial sample 
providers and sample collections. 

In the context of the present mode of the invention, the term "CpG island" refers to a contigu- 
ous region of genomic DNA that satisfies the criteria of (1) having a frequency of CpG dinu- 
cleotides corresponding to an "Observed/Expected Ratio" >0.6, and (2) having a "GC Con- 
tent" >0.5. CpG islands are typically, but not always, between about 0.2 to about 1 kb in 
length. 
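
The two criteria above can be sketched in a few lines. The definition does not state how the "Observed/Expected Ratio" is computed; the sketch below assumes the commonly used form (number of CpG x sequence length) / (number of C x number of G), and the function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio.

    Assumed formula: (#CpG * N) / (#C * #G), where N is the
    length of the sequence.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def meets_cpg_island_criteria(seq: str) -> bool:
    """True if seq satisfies (1) O/E ratio > 0.6 and (2) GC content > 0.5."""
    return cpg_obs_exp(seq) > 0.6 and gc_content(seq) > 0.5


print(meets_cpg_island_criteria("CGCGCGCG"))
print(meets_cpg_island_criteria("CCCCGGGG"))
```

The second sequence is GC-rich but contains only one CpG dinucleotide, which shows why both criteria are applied together. The typical length of about 0.2 to about 1 kb is not enforced here.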

In the context of the present mode of the invention the term "regulatory region" of a gene is 
taken to mean nucleotide sequences which affect the expression of a gene. Said regulatory 
regions may be located within, proximal or distal to said gene. Said regulatory regions include 
but are not limited to constitutive promoters, tissue-specific promoters, developmental- 
specific promoters, inducible promoters and the like. Promoter regulatory elements may also 
include certain enhancer sequence elements that control transcriptional or translational effi- 
ciency of the gene. 

In the context of the present mode of the invention, the term "methylation" refers to the pres- 
ence or absence of 5-methylcytosine ("5-mCyt") at one or a plurality of CpG dinucleotides 
within a DNA sequence. 

In the context of the present mode of the invention the term "methylation state" is taken to 
mean the degree of methylation present in a nucleic acid of interest, this may be expressed in 
absolute or relative terms, i.e. as a percentage or other numerical value or by comparison to 
another tissue and therein described as hypermethylated, hypomethylated or as having significantly 
similar or identical methylation status. 

In the context of the present mode of the invention, the term "hemi-methylation" or 
"hemimethylation" refers to the methylation state of a CpG methylation site, where only a 
single cytosine in one of the two CpG dinucleotide sequences of the double stranded CpG 
methylation site is methylated (e.g., 5'-NNC^MGNN-3' (top strand): 3'-NNGCNN-5' (bottom 
strand)). 

In the context of the present mode of the invention, the term "hypermethylation" refers to the 
average methylation state corresponding to an increased presence of 5-mCyt at one or a plu- 
rality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the 
amount of 5-mCyt found at corresponding CpG dinucleotides within a normal control DNA 
sample. 

In the context of the present mode of the invention, the term "hypomethylation" refers to the 
average methylation state corresponding to a decreased presence of 5-mCyt at one or a plu- 
rality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the 
amount of 5-mCyt found at corresponding CpG dinucleotides within a normal control DNA 
sample. 
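
As a minimal sketch of the two relative terms just defined, the average methylation of a test sample may be compared with that of a normal control at corresponding CpG dinucleotides. The function name, the representation of methylation as fractions between 0 and 1, and the tolerance of 0.1 are all assumptions made for illustration; the definitions above do not fix any threshold.

```python
from statistics import mean


def relative_methylation(test, control, tolerance=0.1):
    """Compare average methylation (fractions in 0..1) at corresponding
    CpG positions of a test sample and a normal control sample.

    tolerance is a hypothetical cut-off below which the two samples
    are reported as having similar methylation status.
    """
    if not test or len(test) != len(control):
        raise ValueError("need the same, non-empty set of CpG positions")
    diff = mean(test) - mean(control)
    if diff > tolerance:
        return "hypermethylated"
    if diff < -tolerance:
        return "hypomethylated"
    return "similar"


# Hypothetical methylation fractions at two corresponding CpG positions.
print(relative_methylation([0.8, 0.9], [0.2, 0.3]))
```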

In the context of the present mode of the invention, the term "microarray" refers broadly to 
both "DNA microarrays," and 'DNA chip(s),' as recognized in the art, encompasses all art- 
recognized solid supports, and encompasses all methods for affixing nucleic acid molecules 
thereto or synthesis of nucleic acids thereon. 

"Genetic parameters" are mutations and polymorphisms of genes and sequences further re- 
quired for their regulation. To be designated as genetic modifications or mutations are, in par- 
ticular, insertions, deletions, point mutations, inversions and polymorphisms and, particularly 
preferred, SNPs (single nucleotide polymorphisms). 

"Epigenetic modifications" or "epigenetic parameters" are modifications of DNA bases of 
genomic DNA and sequences further required for their regulation, in particular, cytosine 
methylations thereof. Further epigenetic parameters include, for example, the acetylation of 
histones which, however, cannot be directly analyzed using the described method but which, 
in turn, correlate with the DNA methylation. 

In the context of the present mode of the invention, the term "bisulfite reagent" refers to a 
reagent comprising bisulfite, disulfite, hydrogen sulfite or combinations thereof, useful as 
disclosed herein to distinguish between methylated and unmethylated CpG dinucleotide se- 
quences. 
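
For illustration, the way such a reagent distinguishes methylated from unmethylated positions can be simulated in silico. The sketch relies on the generally known behavior of bisulfite treatment, namely that unmethylated cytosine is converted to uracil (read as thymine after amplification) while 5-methylcytosine is left unchanged. It models a single strand only, uses 0-based positions, and the function name is hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as read after bisulfite treatment and
    amplification of one strand.

    Unmethylated C is read as T; a C whose 0-based index is in
    methylated_positions is treated as 5-methylcytosine and stays C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# The C at index 1 is methylated and survives; the C at index 4 does not.
print(bisulfite_convert("ACGTCG", {1}))
print(bisulfite_convert("ACGTCG", set()))
```

After conversion the two templates differ in sequence at the CpG position, which is the difference that the methylation assays defined below detect.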

In the context of the present mode of the invention, the term "Methylation assay" refers to any 
assay for determining the methylation state of one or more CpG dinucleotide sequences 
within a sequence of DNA. 

In the context of the present mode of the invention, the term "MS.AP-PCR" (Methylation- 
Sensitive Arbitrarily-Primed Polymerase Chain Reaction) refers to the art-recognized tech- 
nology that allows for a global scan of the genome using CG-rich primers to focus on the re- 
gions most likely to contain CpG dinucleotides, and described by Gonzalgo et al., Cancer 
Research 57:594-599, 1997. 

In the context of the present mode of the invention, the term "MethyLight" refers to the art-
recognized fluorescence-based real-time PCR technique described by Eads et al., Cancer Res. 
59:2302-2306, 1999. 

In the context of the present mode of the invention, the term "HeavyMethyl™" assay, in the 
embodiment thereof implemented herein, refers to a methylation assay comprising methyla- 
tion specific blocking probes covering CpG positions between the amplification primers. 

The term "Ms-SNuPE" (Methylation-sensitive Single Nucleotide Primer Extension) refers to 
the art-recognized assay described by Gonzalgo and Jones, Nucleic Acids Res. 25:2529-2531, 
1997. 

In the context of the present mode of the invention the term "MSP" (Methylation-specific 
PCR) refers to the art-recognized methylation assay described by Herman et al., Proc. Natl. 
Acad. Sci. USA 93:9821-9826, 1996, and by US Patent No. 5,786,146. 

In the context of the present mode of the invention the term "COBRA" (Combined Bisulfite 
Restriction Analysis) refers to the art-recognized methylation assay described by Xiong and 
Laird, Nucleic Acids Res. 25:2532-2534, 1997. 

In the context of the present mode of the invention the term "hybridization" is to be under- 
stood as a bond of an oligonucleotide to a complementary sequence along the lines of the 
Watson-Crick base pairings in the sample DNA, forming a duplex structure. 

"Stringent hybridization conditions," as defined herein, involve hybridizing at 68°C in 5x 
SSC/5x Denhardt's solution/1.0% SDS, and washing in 0.2x SSC/0.1% SDS at room temperature, 
or involve the art-recognized equivalent thereof (e.g., conditions in which a hybridization 
is carried out at 60°C in 2.5x SSC buffer, followed by several washing steps at 37°C in 
a low buffer concentration, and remains stable). Moderately stringent conditions, as defined 
herein, involve washing in 3x SSC at 42°C, or the art-recognized equivalent thereof. 
The parameters of salt concentration and temperature can be varied to achieve the optimal 
level of identity between the probe and the target nucleic acid. Guidance regarding such con- 
ditions is available in the art, for example, by Sambrook et al., 1989, Molecular Cloning, A 
Laboratory Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et al. (eds.), 1995, Current 
Protocols in Molecular Biology, (John Wiley and Sons, N.Y.) at Unit 2.10. 

"Background DNA" as used herein refers to any nucleic acids which originate from sources 
other than breast cells. 

Using the methods and nucleic acids described herein, statistically significant models of pa- 
tient relapse, disease free survival, metastasis free survival, overall survival and/or disease 
progression can be developed and utilized to assist patients and clinicians in determining suit- 
able treatment options to be included in the therapeutic regimen. 

In one aspect the method provides a prognostic marker for a cell proliferative disorder of the 
breast tissues. Preferably this prognosis is provided in terms of an outcome selected from the 
group consisting of likelihood of relapse; overall patient survival; metastasis free survival; 
disease free survival or disease progression. 

In a further aspect of the invention said marker is used as a predictive marker of outcome of a 
treatment which targets the estrogen receptor pathway or is involved in estrogen metabolism, 
production or secretion as a therapy for patients suffering from a cell proliferative disorder of 
the breast tissues. This aspect of the method enables the physician to determine which treat- 
ments may be used in addition to or instead of said endocrine treatment. It is preferred that 
said additional treatment is a more aggressive therapy such as, but not limited to, chemotherapy. 
Thus, the present invention will be seen to reduce the problems associated with present 
breast cell proliferative disorder prognostic, predictive and treatment response prediction 
methods. 

Using the methods and nucleic acids as described herein, patient survival can be evaluated 
before or during treatment for a cell proliferative disorder of the breast tissues, in order to 
provide critical information to the patient and clinician as to the likely progression of the dis- 
ease. It will be appreciated, therefore, that the methods and nucleic acids exemplified herein 
can serve to improve a patient's quality of life and odds of treatment success by allowing both 
patient and clinician a more accurate assessment of the patient's treatment options. 

The herein disclosed method may be used for the improved treatment of all breast cell prolif- 
erative disorder patients, both pre- and post- menopausal and independent of their node or 
estrogen receptor status. However, it is particularly preferred that said patients are node- 
negative and estrogen receptor positive. 

The present invention makes available a method for the improved treatment of breast cell 
proliferative disorders, by enabling the improved prediction of a patient's survival, in particu- 
lar by predicting the likelihood of relapse post-surgery both with or without adjuvant endo- 
crine treatment. Furthermore, the present invention provides a means for the improved pre- 
diction of treatment outcome with endocrine therapy, wherein said therapy comprises one or 
more treatments which target the estrogen receptor pathway or are involved in estrogen me- 
tabolism, production, or secretion. 

The method according to the invention may be used for the analysis of a wide variety of cell 
proliferative disorders of the breast tissues including, but not limited to, ductal carcinoma in 
situ, invasive ductal carcinoma, invasive lobular carcinoma, lobular carcinoma in situ, comedocarcinoma, 
inflammatory carcinoma, mucinous carcinoma, scirrhous carcinoma, colloid 
carcinoma, tubular carcinoma, medullary carcinoma, metaplastic carcinoma, and papillary 
carcinoma and papillary carcinoma in situ, undifferentiated or anaplastic carcinoma and Paget's 
disease of the breast. 

The method according to the invention may be used to provide a prognosis of breast cell proliferative 
disorder patients; furthermore, said method may be used to provide a prediction of 
patient survival and/or relapse following treatment by endocrine therapy. 

Wherein the herein disclosed markers, methods and nucleic acids are used as prognostic 
markers it is particularly preferred that said prognosis is defined in terms of patient survival 
and/or relapse. In this embodiment patients' survival times and/or relapse are predicted according 
to their gene expression or genetic or epigenetic modifications thereof. In this aspect 
of the invention it is particularly preferred that said patients are tested prior to receiving any 
adjuvant endocrine treatment. 

Wherein the herein disclosed markers, methods and nucleic acids are used as predictive mark- 
ers it is particularly preferred that the method is applied to predict the outcome of patients 
who receive endocrine treatment as secondary treatment to an initial non chemotherapeutical 
therapy, e.g. surgery (hereinafter referred to as the 'adjuvant setting') as illustrated in Figure 1. 
Such a treatment is often prescribed to patients suffering from Stage 1 to 3 breast carcinomas. 
It is also preferred that said 'outcome' is defined in terms of patient survival and/or relapse. 

In this embodiment patients' survival times and/or relapse are predicted according to their 
gene expression or genetic or epigenetic modifications thereof. By detecting patients with 
below average or below median metastasis free survival or disease free survival times and/or 
high likelihood of relapse the physician may choose to recommend the patient for further 
treatment, instead of or in addition to the endocrine targeting therapy(s), in particular but not 
limited to, chemotherapy. 

The herein described invention provides a novel breast cell proliferative disorder prognostic 
and predictive biomarker. 

It is herein described that aberrant expression of the gene PITX2 and/or regulatory or promoter 
regions thereof is correlated to prognosis and/or prediction of outcome of estrogen 
treatment of breast cell proliferative disorder patients, in particular breast carcinoma. 

This marker thereby provides a novel means for the characterization of breast cell prolifera- 
tive disorders. As described herein determination of the expression of the gene PITX2 and/or 
regulatory or promoter regions thereof enables the prediction of prognosis of a patient with a 
proliferative disorder of the breast tissues. In an alternative embodiment the expression of the 
gene PITX2 and/or regulatory or promoter regions thereof enables the prediction of treatment 
response of a patient treated with one or more treatments which target the estrogen receptor, 
synthesis or conversion pathways or are otherwise involved in estrogen metabolism, produc- 
tion or secretion. 

The herein described invention is thereby useful for the differentiation of individuals who 
may be appropriately treated with one or more treatments which target the estrogen receptor 
pathway or are involved in estrogen metabolism, production or secretion from those individu- 
als, who would be optimally treated with other treatments in addition to said treatment. Pre- 
ferred 'other treatments' include but are not limited to chemotherapy or radiotherapy. It is 
particularly preferred that said prognosis and/or treatment response is stated in terms of likeli- 
hood of relapse, survival or outcome. 

In a further embodiment of the invention the aberrant expression of a plurality of genes com- 
prising the gene PITX2 and/or regulatory or promoter regions thereof is analyzed. Said plu- 
rality of genes is hereinafter also referred to as a 'gene panel'. The analysis of multiple genes 
increases the accuracy of a provided prognosis and/or prediction of estrogen treatment out- 
come. It is preferred that the gene panel consists of up to seven genes and/or their promoter 
regions associated with prognosis and/or prediction of treatment response of breast carcinoma 
patients. It is further preferred that said panel consists of the gene PITX2 and one or more 
genes selected from the group consisting of ABCA8, CDK6, ERBB2, ONECUT2, PLAU, 
TBC1D3 and TFF1 and/or regulatory regions thereof. It is particularly preferred that the gene 
panel is selected from the group of gene panels consisting of: 

• PITX2, PLAU and TFF1 

• PITX2 and PLAU 

• PITX2 and TFF1 

It is particularly preferred that the gene panel consisting of PITX2 and TFF1 is used to predict 
outcome of treatment of patients with an endocrine treatment. It is particularly preferred that 
the gene panel consisting of PITX2 and PLAU is used to provide a prognosis of patients. It is 
preferred that said patients are analyzed prior to receiving any endocrine treatment. 

In further embodiments this invention relates to new methods and sequences for the prognosis 
of patients diagnosed with breast cell proliferative disease. In a further aspect the invention 
relates to new methods and sequences, which may be used as tools for the selection of suit- 
able treatments of patients diagnosed with breast cell proliferative disease based on a predic- 
tion of likelihood of relapse, survival or outcome. 

More specifically this invention provides new methods and sequences for patients diagnosed 
with breast cell proliferative disease, allowing the improved selection of suitable adjuvant 
therapy. Furthermore, it is preferred that patients with poor prognosis following endocrine 
monotherapy are provided with chemotherapy in addition to or instead of an endocrine ther- 
apy. 

One aspect of the invention is the provision of methods for providing a prognosis and/or pre- 
diction of outcome of endocrine treatment of a patient with a cell proliferative disorder of the 
breast tissues. Preferably said prognosis and/or prediction is provided in terms of likelihood of 
relapse or the survival of said patient. It is further preferred that said survival is disease free 
survival or metastasis free survival. It is also preferred that said disease is breast cancer. These 
methods comprise the analysis of the expression levels of the gene PITX2 and/or regulatory 
regions thereof. 

In further embodiments the method comprises analysis of the expression of a 'gene panel' 
comprising the gene PITX2 and one or more genes selected from the group consisting of 
ABCA8, CDK6, ERBB2, ONECUT2, PLAU, TBC1D3 and TFF1 and/or regulatory regions 
thereof. It is particularly preferred that said gene panels are selected from the group of gene 
panels consisting of: 

• PITX2, PLAU and TFF1 

• PITX2 and PLAU 

• PITX2 and TFF1 

It is particularly preferred that the expression of the gene panel consisting of PITX2 and TFF1 is 
determined in order to predict outcome of treatment of patients with an endocrine treatment. It 
is also particularly preferred that the expression of the gene panel consisting of PITX2 and 
PLAU is determined in order to provide a prognosis of patients. It is preferred that said patients 
are analyzed prior to receiving any endocrine treatment. 

Determination of expression may be achieved by any means standard in the art, however it is 
most preferably achieved by analysis of LOH, methylation, protein expression, mRNA ex- 
pression, genetic or other epigenetic modifications of the genomic sequences. 

Especially preferred is the analysis of the DNA methylation profile of the genomic sequence 
of the gene PITX2 and/or regulatory or promoter regions thereof as given in SEQ ID NO: 
1130. Further preferred is the analysis of the methylation status of CpG positions within the 
following sections of SEQ ID NO: 1130: nucleotide 2,700 to nucleotide 3,000; nucleotide 3,900 to 
nucleotide 4,200; nucleotide 5,500 to nucleotide 8,000; nucleotide 13,500 to nucleotide 14,500; 
nucleotide 16,500 to nucleotide 18,000; nucleotide 18,500 to nucleotide 19,000; nucleotide 
21,000 to nucleotide 22,500. Especially preferred is the analysis of the methylation status of 
eight specific CpG dinucleotides, covered in the four sub-sequences of said SEQ ID NO: 
1130 given in SEQ ID NOs: 23 and 1140-1142. Wherein the method comprises analysis of a gene 
panel comprising the gene PITX2 and one or more genes selected from the group consisting of 
ABCA8, CDK6, ERBB2, ONECUT2, PLAU, TBC1D3 and TFF1 and/or regulatory or promoter 
regions thereof it is preferred that the sequence of said genes is selected from the group 
consisting of SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 
16, SEQ ID NO: 43, SEQ ID NO: 12 and SEQ ID NO: 1131 according to Table 1. 
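
By way of illustration only, the further preferred sections of SEQ ID NO: 1130 recited above may be held as coordinate ranges and used to check whether a given CpG position lies within one of them. The sketch assumes 1-based, inclusive nucleotide coordinates, which the text does not state explicitly; the names `PREFERRED_REGIONS` and `in_preferred_region` are hypothetical.

```python
# Sections of SEQ ID NO: 1130 recited in the text (start, end),
# assumed here to be 1-based and inclusive.
PREFERRED_REGIONS = [
    (2700, 3000),
    (3900, 4200),
    (5500, 8000),
    (13500, 14500),
    (16500, 18000),
    (18500, 19000),
    (21000, 22500),
]


def in_preferred_region(position: int) -> bool:
    """True if the nucleotide position falls within any recited section."""
    return any(start <= position <= end for start, end in PREFERRED_REGIONS)


print(in_preferred_region(6000))
print(in_preferred_region(3500))
```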

This methodology presents further improvements over the state of the art in that the method 
may be applied to any subject, independent of the estrogen and/or progesterone receptor 
status. Therefore in a preferred embodiment, the subject is not required to have been tested for 
estrogen or progesterone receptor status. 

In further aspects of the invention, the disclosed matter provides novel nucleic acid sequences 
useful for the analysis of methylation within said gene; other aspects provide novel uses of the 
gene and the gene product as well as methods, assays and kits directed to providing a prognosis 
and/or predicting outcome of endocrine treatment of a patient diagnosed with breast cell 
proliferative disease. 

In one embodiment the invention discloses a method for providing the prognosis and/or pre- 
dicting outcome of endocrine treatment of a patient suffering from a breast cell proliferative 
disease, by analysis of expression of the gene PITX2 and/or regulatory regions thereof. Pref- 
erably said endocrine treatment is an adjuvant endocrine monotherapy. Said method may be 
enabled by means of any analysis of the expression of the gene, including but not limited to 
mRNA expression analysis or protein expression analysis or by analysis of its genetic modifications 
leading to an altered expression (including LOH). However, in the most preferred embodiment 
of the invention, said expression is determined by means of analysis of the methylation 
status of CpG sites within the gene PITX2 and its promoter or regulatory elements. 

In one embodiment of the method aberrant expression of the gene PITX2 and/or panels 
thereof may be detected by analysis of loss of heterozygosity of the gene. In a first step ge- 
nomic DNA is isolated from a biological sample of the patient's tumor. The isolated DNA is 
then analyzed for LOH by any means standard in the art including but not limited to amplifi- 
cation of the gene locus or associated microsatellite markers. Said amplification may be car- 
ried out by any means standard in the art including polymerase chain reaction (PCR), strand 
displacement amplification (SDA) and isothermal amplification. 

The level of amplificate is then detected by any means known in the art including but not 
limited to gel electrophoresis and detection by probes (including Real Time PCR). Furthermore 
the amplificates may be labeled in order to aid said detection. Suitable detectable labels 
include but are not limited to fluorescent labels, radioactive labels and mass labels, the suitable 
use of which shall be described herein. 

The detection of a decreased amount of an amplificate corresponding to one of the amplified 
alleles in a test sample as relative to that of a heterozygous control sample is indicative of 
LOH. 
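
The comparison described above can be sketched as a ratio of allele signals. The sketch assumes that the amount of amplificate for each of two alleles has been quantified (for example as a peak area) in both the test sample and a heterozygous control; the normalization shown and the cut-off of 0.5 are hypothetical choices for illustration and are not taken from the present disclosure.

```python
def allelic_ratio(test_a, test_b, control_a, control_b):
    """Allele ratio of the test sample normalized to the heterozygous
    control, folded so that the result is at most 1.0.

    A value near 1.0 means both alleles are retained in the test sample;
    a low value means one allele is under-represented.
    """
    for value in (test_a, test_b, control_a, control_b):
        if value <= 0:
            raise ValueError("amplificate amounts must be positive")
    ratio = (test_a / test_b) / (control_a / control_b)
    return min(ratio, 1.0 / ratio)


def indicates_loh(test_a, test_b, control_a, control_b, threshold=0.5):
    """True if the normalized ratio falls below a hypothetical threshold."""
    return allelic_ratio(test_a, test_b, control_a, control_b) < threshold


# Hypothetical signal intensities: allele A is reduced in the test sample.
print(indicates_loh(200.0, 1000.0, 1000.0, 1000.0))
```

Folding the ratio means the check does not depend on which of the two alleles is lost.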

To detect the levels of mRNA encoding PITX2 and/or panels comprising said gene in a de- 
tection system for breast cancer relapse, a sample is obtained from a patient. Said obtaining of 
a sample is preferably not meant to be retrieving of a sample, as in performing a biopsy, but 
rather directed to the availability of an isolated biological material representing a specific tissue, 
relevant for the intended use. The sample can be a tumor tissue sample from the surgically 
removed tumor, a biopsy sample as taken by a surgeon and provided to the analyst or a 
sample of blood, plasma, serum or the like. The sample may be treated to extract the nucleic 
acids contained therein. The resulting nucleic acid from the sample is subjected to gel electro- 
phoresis or other separation techniques. Detection involves contacting the nucleic acids and in 
particular the mRNA of the sample with a DNA sequence serving as a probe to form hybrid 
duplexes. The stringency of hybridization is determined by a number of factors during hy- 
bridization and during the washing procedure, including temperature, ionic strength, length of 
time and concentration of formamide. These factors are outlined in, for example, Sambrook et 
al. (Molecular Cloning: A Laboratory Manual, 2nd ed., 1989). Detection of the resulting du- 
plex is usually accomplished by the use of labeled probes. Alternatively, the probe may be 
unlabeled, but may be detectable by specific binding with a ligand which is labeled, either 
directly or indirectly. Suitable labels and methods for labeling probes and ligands are known 
in the art, and include, for example, radioactive labels which may be incorporated by known 
methods (e.g., nick translation or kinasing), biotin, fluorescent groups, chemiluminescent 
groups (e.g., dioxetanes, particularly triggered dioxetanes), enzymes, antibodies, and the like. 

In order to increase the sensitivity of the detection in a sample of mRNA encoding PITX2 
and/or panels comprising said gene, the technique of reverse transcription/polymerase 
chain reaction can be used to amplify cDNA transcribed from mRNA encoding PITX2 and/or 
panels comprising said gene. The method of reverse transcription/PCR is well known in the 
art (for example, see Watson and Fleming, supra). 

The reverse transcription/PCR method can be performed as follows. Total cellular RNA is 
isolated by, for example, the standard guanidinium isothiocyanate method and the total RNA is 
reverse transcribed. The reverse transcription method involves synthesis of DNA on a tem- 
plate of RNA using a reverse transcriptase enzyme and a 3' end primer. Typically, the primer 
contains an oligo(dT) sequence. The cDNA thus produced is then amplified using the PCR 
method and PITX2 and/or panels comprising said gene specific primers (Belyavsky et al., 
Nucl Acid Res 17:2919-2932, 1989; Krug and Berger, Methods in Enzymology, Academic 
Press, N.Y., Vol. 152, pp. 316-325, 1987, which are incorporated by reference). 

The present invention may also be described in certain embodiments as a kit for use in predicting 
the likelihood of relapse and/or survival of a breast cancer patient before or after surgical 
tumor removal with or without adjuvant endocrine monotherapy state through testing of 
a biological sample. A representative kit may comprise one or more nucleic acid segments as 
described above that selectively hybridize to PITX2 mRNA and/or mRNA from genes of a 
panel comprising said PITX2 gene, and a container for each of the one or more nucleic acid 
segments. In certain embodiments the nucleic acid segments may be combined in a single 
tube. In further embodiments, the nucleic acid segments may also include a pair of primers for 
amplifying the target mRNA. Such kits may also include any buffers, solutions, solvents, en- 
zymes, nucleotides, or other components for hybridization, amplification or detection reac- 
tions. Preferred kit components include reagents for reverse transcription-PCR, in situ hy- 
bridization, Northern analysis and/or RPA. 

The present invention further provides for methods to detect the presence of the
polypeptide(s) of PITX2, and/or panels comprising said protein, in a sample obtained from a
patient. It is preferred that said sequence is essentially the same as the sequence given in
Figure 107. Any method known in the art for detecting proteins can be used. Such methods
include, but are not limited to, immunodiffusion, immunoelectrophoresis, immunochemical
methods, binder-ligand assays, immunohistochemical techniques, agglutination and complement
assays (for example, see Basic and Clinical Immunology, Stites and Terr, eds., Appleton and
Lange, Norwalk, Conn., pp. 217-262, 1991, which is incorporated by reference). Preferred are
binder-ligand immunoassay methods including reacting antibodies with an epitope or epitopes
of PITX2 and/or panels thereof and competitively displacing a labeled PITX2 protein and/or
panels thereof or derivatives thereof.

Certain embodiments of the present invention comprise the use of antibodies specific to the 
polypeptide encoded by the gene PITX2 and/or panels comprising said gene. Such antibodies 
may be useful for providing a prognosis of the likelihood of relapse and/or survival of a breast 
cancer patient preferably under adjuvant endocrine monotherapy by comparing a patient's 
levels of PITX2 marker expression and/or the expression of panels comprising PITX2 to ex- 
pression of the same marker(s) in normal individuals. In certain embodiments the production 
of monoclonal or polyclonal antibodies can be induced by the use of the PITX2 and/or other
polypeptides of the panels as antigens. Such antibodies may in turn be used to detect
expressed proteins as markers for prognosis of relapse of a breast cancer patient under adjuvant
endocrine monotherapy. The levels of such proteins present in the peripheral blood of a
patient may be quantified by conventional methods. Antibody-protein binding may be detected
and quantified by a variety of means known in the art, such as labeling with fluorescent or 
radioactive ligands. The invention further comprises kits for performing the above-mentioned
procedures, wherein such kits contain antibodies specific for the polypeptides of PITX2 and/or
panels thereof.

Numerous competitive and non-competitive protein binding immunoassays are well known in 
the art. Antibodies employed in such assays may be unlabeled, for example as used in
agglutination tests, or labeled for use in a wide variety of assay methods. Labels that can be
used include radionuclides, enzymes, fluorescers, chemiluminescers, enzyme substrates or co-factors,
enzyme inhibitors, particles, dyes and the like for use in radioimmunoassay (RIA), enzyme 
immunoassays, e.g., enzyme-linked immunosorbent assay (ELISA), fluorescent immunoas- 
says and the like. Polyclonal or monoclonal antibodies to PITX2 and/or panels thereof or an 
epitope thereof can be made for use in immunoassays by any of a number of methods known 
in the art. One approach for preparing antibodies to a protein is the selection and preparation 
of an amino acid sequence of all or part of the protein, chemically synthesising the sequence 
and injecting it into an appropriate animal, usually a rabbit or a mouse (Milstein and Kohler,
Nature 256:495-497, 1975; Galfre and Milstein, Methods in Enzymology: Immunochemical
Techniques 73:1-46, Langone and Van Vunakis eds., Academic Press, 1981, which are incorporated
by reference). Methods for preparation of PITX2 and/or panels thereof or an epitope thereof 
include, but are not limited to chemical synthesis, recombinant DNA techniques or isolation 
from biological samples. 

In one aspect the invention provides significant improvements over the state of the art in that
PITX2 is the first single marker that can be used to predict the likelihood of relapse or of
survival of a breast cancer patient under adjuvant endocrine monotherapy.

In the most preferred embodiment of the invention the analysis of expression is carried out by 
means of methylation analysis. It is further preferred that the methylation state of the CpG 
dinucleotides within the genomic sequence according to SEQ ID NO: 1130 and sequences 
complementary thereto is analyzed. SEQ ID NO: 1130 discloses the gene PITX2 and its
promoter and regulatory elements, wherein said fragment comprises CpG dinucleotides
exhibiting a methylation pattern specific for prognosis and/or prediction of outcome of
endocrine treatment. Further preferred is the analysis of the methylation status of CpG
positions within the following sections of SEQ ID NO: 1130: nucleotide 2,700-nucleotide 3,000;
nucleotide 3,900-nucleotide 4,200; nucleotide 5,500-nucleotide 8,000; nucleotide
13,500-nucleotide 14,500; nucleotide 16,500-nucleotide 18,000; nucleotide 18,500-nucleotide
19,000; nucleotide 21,000-nucleotide 22,500. Also preferred is the analysis of the sub-sequence
of the gene PITX2 as shown in SEQ ID NO: 23.
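By way of illustration only, the preferred sections of SEQ ID NO: 1130 named above can be written as coordinate intervals and used to select CpG positions for analysis. The following is a minimal sketch, assuming 1-based inclusive coordinates (the numbering convention is not stated in the text); the names `PREFERRED_REGIONS` and `in_preferred_region` and the example positions are hypothetical.

```python
# Preferred sections of SEQ ID NO: 1130 as (start, end) pairs.
# Assumption: coordinates are 1-based and inclusive.
PREFERRED_REGIONS = [
    (2700, 3000),
    (3900, 4200),
    (5500, 8000),
    (13500, 14500),
    (16500, 18000),
    (18500, 19000),
    (21000, 22500),
]


def in_preferred_region(position: int) -> bool:
    """Return True if a nucleotide position lies in any preferred section."""
    return any(start <= position <= end for start, end in PREFERRED_REGIONS)


# Hypothetical CpG positions, chosen only to exercise the filter.
example_positions = [2650, 2750, 4200, 9000, 21500]
selected = [p for p in example_positions if in_preferred_region(p)]
print(selected)
```

Keeping the intervals in a single list makes it straightforward to adjust the boundaries should a different numbering convention apply.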

Where the method comprises analysis of the expression of a 'gene panel' comprising the
gene and/or regulatory or promoter regions thereof and one or more genes selected from the
group consisting of ABCA8, CDK6, ERBB2, ONECUT2, PLAU, TBC1D3 and TFF1, it is
most preferred that said analysis of expression is carried out by means of methylation
analysis. It is particularly preferred that the CpG methylation of a gene panel selected from
the group of gene panels consisting of:

• PITX2, PLAU and TFF1

• PITX2 and PLAU

• PITX2 and TFF1

is analyzed.

It is particularly preferred that the methylation of the gene panel consisting of PITX2 and
TFF1 is determined in order to predict the outcome of treatment of patients with an endocrine
treatment. It is also particularly preferred that the methylation of the gene panel consisting of
PITX2 and PLAU is determined in order to provide a prognosis of patients. It is preferred that said
patients are analyzed prior to receiving any endocrine treatment. 

Hypermethylation of PITX2 and selected other genes as described herein, and/or sequences
thereof, is associated with poor prognosis and/or outcome of endocrine treatment of breast cell
proliferative disorders, most preferably breast carcinoma.

The methylation pattern of the gene PITX2 and its promoter and regulatory elements has
heretofore not been analyzed with regard to prognosis or prediction of outcome of endocrine
treatment of a patient diagnosed with a breast cell proliferative disorder. Due to the
degeneracy of the genetic code, the sequence as identified in SEQ ID NO: 1130 should be
interpreted so as to include all substantially similar and equivalent sequences upstream of the
promoter region of a gene which encodes a polypeptide with the biological activity of that
encoded by PITX2.

Most preferably, the following method is used to detect methylation within the gene PITX2
and/or regulatory or promoter regions thereof wherein said methylated nucleic acids are
present in an excess of background DNA, wherein the background DNA is present at 100 to 1000
times the concentration of the DNA to be detected.

The method for the analysis of methylation comprises contacting a nucleic acid sample
obtained from a subject with at least one reagent or a series of reagents, wherein said reagent
or series of reagents distinguishes between methylated and non-methylated CpG dinucleotides
within the target nucleic acid.

Preferably, said method comprises the following steps: In the first step, a sample of the tissue
to be analyzed is obtained. The source may be any suitable source. Preferably, the source of
the sample is selected from the group consisting of histological slides, biopsies,
paraffin-embedded tissue, bodily fluids, plasma, serum, stool, urine, blood, nipple aspirate and
combinations thereof. Preferably, the source is tumor tissue, biopsies, serum, urine, blood or
nipple aspirate. The most preferred source is the tumor sample surgically removed from the
patient or a biopsy sample of said patient.

The DNA is then isolated from the sample. Genomic DNA may be isolated by any means 
standard in the art, including the use of commercially available kits. Briefly, where the
DNA of interest is encapsulated in/by a cellular membrane, the biological sample must be
disrupted and lysed by enzymatic, chemical or mechanical means. The DNA solution may then
be cleared of proteins and other contaminants, e.g. by digestion with proteinase K. The
genomic DNA is then recovered from the solution. This may be carried out by means of a variety
of methods including salting out, organic extraction or binding of the DNA to a solid phase 
support. The choice of method will be affected by several factors including time, expense and 
required quantity of DNA. 

The genomic DNA sample is then treated in such a manner that cytosine bases which are
unmethylated at the 5'-position are converted to uracil, thymine, or another base which is
dissimilar to cytosine in terms of hybridization behavior. This will be understood as
"treatment" or "pre-treatment" herein.

The above described pre-treatment of genomic DNA is preferably carried out with bisulfite 
(hydrogen sulfite, disulfite) and subsequent alkaline hydrolysis which results in a conversion 
of non-methylated cytosine nucleobases to uracil or to another base which is dissimilar to 
cytosine in terms of base pairing behavior. Enclosing the DNA to be analyzed in an agarose 
matrix, thereby preventing the diffusion and renaturation of the DNA (bisulfite only reacts 
with single-stranded DNA), and replacing all precipitation and purification steps with fast 
dialysis (Olek A, et al., A modified and improved method for bisulfite based cytosine
methylation analysis, Nucleic Acids Res. 24:5064-6, 1996) is one preferred example of how to
perform said pre-treatment. It is further preferred that the bisulfite treatment is carried out
in the presence of a radical scavenger or DNA denaturing agent.
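The effect of the pre-treatment on a sequence can be sketched in silico. The following minimal example assumes a single DNA strand, complete conversion, and that uracil is read as thymine after amplification; the function name `bisulfite_convert` and the 12-base fragment are hypothetical and serve only to illustrate the principle.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate bisulfite treatment of one DNA strand followed by amplification.

    Unmethylated cytosines are written as 'T' (uracil pairs like thymine).
    Cytosines at the given 0-based positions are treated as 5-methylcytosine
    and remain 'C'.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical fragment with CpG dinucleotides starting at positions 2 and 8.
fragment = "ATCGTCAACGCC"

methylated_read = bisulfite_convert(fragment, {2, 8})
unmethylated_read = bisulfite_convert(fragment, set())

print(methylated_read)    # CpG positions keep their C
print(unmethylated_read)  # every C is read as T
```

After conversion the two methylation states differ only at the former CpG positions (CpG versus TpG), which is the difference the subsequent detection steps rely upon.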

The treated DNA is then analyzed in order to determine the methylation state of the gene
PITX2 and/or regulatory regions thereof (prior to the treatment) associated with prognosis
and/or outcome of endocrine treatment. In a further embodiment of the method, the
methylation state of the gene PITX2 and/or regulatory regions thereof and the methylation state
of one or more genes selected from the group consisting of ABCA8, CDK6, ERBB2, ONECUT2,
PLAU, TBC1D3 and TFF1 and/or regulatory or promoter regions thereof are determined. It is
particularly preferred that the methylation status of a gene panel selected from the group of
gene panels consisting of PITX2, PLAU and TFF1; PITX2 and PLAU; and PITX2 and TFF1 is
determined. It is further preferred that the sequences of said genes as described in the
accompanying sequence listing (see Table 3) are analyzed.

In the third step of the method, fragments of the pretreated DNA are amplified. Where the
source of the DNA is free DNA from serum, or DNA extracted from paraffin, it is particularly
preferred that the size of the amplificate fragment is between 100 and 200 base pairs in length,
and where said DNA source is extracted from cellular sources (e.g. tissues, biopsies, cell
lines) it is preferred that the amplificate is between 100 and 350 base pairs in length. It is
particularly preferred that said amplificates comprise at least one 20 base pair sequence
comprising at least three CpG dinucleotides. Said amplification is carried out using sets of
primer oligonucleotides according to the present invention, and a preferably heat-stable
polymerase. The amplification of several DNA segments can be carried out simultaneously in one
and the same reaction vessel; in one embodiment of the method, preferably six or more fragments
are amplified simultaneously. Typically, the amplification is carried out using a polymerase
chain reaction (PCR). The set of primer oligonucleotides includes at least two oligonucleotides
whose sequences are each reverse complementary, identical, or hybridize under stringent or
highly stringent conditions to an at least 18-base-pair long segment of the base sequences of
SEQ ID NO: 250-251, 372-373, SEQ ID NOs: 302-303, 296-297, 214-215, 274-275, 236-237,
290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397, 358-359, 412-413, 350-351
and SEQ ID NO: 1132 to SEQ ID NO: 1139 and sequences complementary thereto.
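The amplificate preferences stated above (a length range that depends on the DNA source, and at least one 20 base pair stretch containing at least three CpG dinucleotides) lend themselves to a simple check. The sketch below is illustrative only: it assumes that CpG dinucleotides are counted on the genomic sequence prior to treatment, and the function names and test sequences are hypothetical.

```python
def max_cpg_in_window(seq: str, window: int = 20) -> int:
    """Largest number of complete CpG dinucleotides found in any window."""
    seq = seq.upper()
    if len(seq) < window:
        return 0
    best = 0
    for start in range(len(seq) - window + 1):
        best = max(best, seq[start:start + window].count("CG"))
    return best


def amplicon_is_suitable(seq: str, min_len: int = 100, max_len: int = 200,
                         window: int = 20, min_cpg: int = 3) -> bool:
    """Check the length range and the CpG density preference.

    Defaults follow the 100 to 200 base pair range given for serum or
    paraffin-derived DNA; pass max_len=350 for DNA from cellular sources.
    """
    length_ok = min_len <= len(seq) <= max_len
    return length_ok and max_cpg_in_window(seq, window) >= min_cpg


# Hypothetical sequences, constructed only to exercise the check.
dense = "CGACGTCG" + "A" * 112          # 120 bases, three CpGs close together
sparse = ("CG" + "A" * 28) * 4          # 120 bases, CpGs 30 bases apart
print(amplicon_is_suitable(dense), amplicon_is_suitable(sparse))
```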

In a preferred embodiment of the method the primers may be selected from the group
consisting of SEQ ID NO: 1143 to SEQ ID NO: 1147.

In an alternate embodiment of the method, the methylation status of preselected CpG
positions within the nucleic acid sequences comprising SEQ ID NO: 23, SEQ ID NO: 49, SEQ ID
NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12,
SEQ ID NO: 1130 and SEQ ID NO: 1131 may be detected by use of methylation-specific
primer oligonucleotides. This technique (MSP) has been described in United States Patent No.
6,265,171 to Herman. The use of methylation status specific primers for the amplification of
bisulfite treated DNA allows the differentiation between methylated and unmethylated nucleic
acids. MSP primer pairs contain at least one primer which hybridizes to a bisulfite treated
CpG dinucleotide. Therefore, the sequence of said primers comprises at least one CpG, TpG
or CpA dinucleotide. MSP primers specific for non-methylated DNA contain a "T" at the 3'
position of the C position in the CpG. Preferably, therefore, the base sequence of said primers
is required to comprise a sequence having a length of at least 18 nucleotides which hybridizes
to a pretreated nucleic acid sequence according to SEQ ID NO: 250-251, 372-373 and SEQ
ID NO: 1132, 1133, 1136 and 1137 and sequences complementary thereto, wherein the base
sequence of said oligomers comprises at least one CpG, TpG or CpA dinucleotide. In this
embodiment of the method according to the invention it is particularly preferred that the MSP
primers comprise between 2 and 4 CpG, TpG or CpA dinucleotides. It is further preferred that
said dinucleotides are located within the 3' half of the primer, e.g. wherein a primer is 18
bases in length the specified dinucleotides are located within the first 9 bases from the 3' end
of the molecule. In addition to the CpG, TpG or CpA dinucleotides it is further preferred that
said primers should further comprise several bisulfite converted bases (i.e. cytosine converted
to thymine, or on the hybridizing strand, guanine converted to adenosine). In a further
preferred embodiment said primers are designed so as to comprise no more than 2 cytosine or
guanine bases.
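The primer preferences described above (a length of at least 18 nucleotides, between 2 and 4 of the specified dinucleotides, and their location in the 3' half) can be expressed as a simple check. The sketch below is a simplification: it considers only primers specific for methylated DNA, in which the relevant positions appear as CpG, because TpG or CpA dinucleotides arising from unmethylated CpG positions cannot be told apart from other TG or CA dinucleotides using the primer sequence alone. The function names and primer sequences are hypothetical.

```python
def cpg_sites(primer: str) -> list:
    """0-based start indices, counted from the 5' end, of CpG dinucleotides."""
    primer = primer.upper()
    return [i for i in range(len(primer) - 1) if primer[i:i + 2] == "CG"]


def meets_msp_preferences(primer: str) -> bool:
    """Check length, CpG count, and location of the CpGs in the 3' half."""
    length = len(primer)
    sites = cpg_sites(primer)
    # For an 18-mer this is index 9, i.e. the last 9 bases form the 3' half.
    three_prime_half_start = length - length // 2
    return (
        length >= 18
        and 2 <= len(sites) <= 4
        and all(i >= three_prime_half_start for i in sites)
    )


# Hypothetical 18-mers written 5' to 3'.
good_primer = "TTAGGATTTAGCGTTCGT"   # both CpGs within the last 9 bases
bad_primer = "CGTTAGGATTTAGTTCGT"    # one CpG at the 5' end
print(meets_msp_preferences(good_primer), meets_msp_preferences(bad_primer))
```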

The fragments obtained by means of the amplification can carry a directly or indirectly de- 
tectable label. Preferred are labels in the form of fluorescence labels, radionuclides, or detach- 
able molecule fragments having a typical mass which can be detected in a mass spectrometer. 
Where said labels are mass labels, it is preferred that the labeled amplificates have a single 
positive or negative net charge, allowing for better detectability in the mass spectrometer. The 
detection may be carried out and visualized by means of, e.g., matrix assisted laser
desorption/ionization mass spectrometry (MALDI) or using electrospray mass spectrometry (ESI).

Matrix Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-TOF) is a very 
efficient development for the analysis of biomolecules (Karas and Hillenkamp, Anal Chem., 
60:2299-301, 1988). An analyte is embedded in a light-absorbing matrix. The matrix is 
evaporated by a short laser pulse thus transporting the analyte molecule into the vapor phase 
in an unfragmented manner. The analyte is ionized by collisions with matrix molecules. An 
applied voltage accelerates the ions into a field-free flight tube. Due to their different masses, 
the ions are accelerated at different rates. Smaller ions reach the detector sooner than bigger 
ones. MALDI-TOF spectrometry is well suited to the analysis of peptides and proteins. The 
analysis of nucleic acids is somewhat more difficult (Gut and Beck, Current Innovations and 
Future Trends, 1:147-57, 1995). The sensitivity with respect to nucleic acid analysis is ap- 
proximately 100-times less than for peptides, and decreases disproportionally with increasing 
fragment size. Moreover, for nucleic acids having a multiply negatively charged backbone, 
the ionisation process via the matrix is considerably less efficient. In MALDI-TOF spec- 
trometry, the selection of the matrix plays an eminently important role. For the desorption of 
peptides, several very efficient matrixes have been found which produce a very fine crystalli- 
sation. There are now several responsive matrixes for DNA, however, the difference in sensi- 
tivity between peptides and nucleic acids has not been reduced. This difference in sensitivity 
can be reduced, however, by chemically modifying the DNA in such a manner that it becomes 
more similar to a peptide. For example, phosphorothioate nucleic acids, in which the usual 
phosphates of the backbone are substituted with thiophosphates, can be converted into a 
charge-neutral DNA using simple alkylation chemistry (Gut and Beck, Nucleic Acids Res. 23:
1367-73, 1995). The coupling of a charge tag to this modified DNA results in an increase in
MALDI-TOF sensitivity to the same level as that found for peptides. A further advantage of
charge tagging is the increased stability of the analysis against impurities, which make the
detection of unmodified substrates considerably more difficult.
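The time-of-flight separation described above, in which smaller ions reach the detector sooner than bigger ones, follows from a simple energy balance: an ion of charge z accelerated through a voltage U gains the kinetic energy z·e·U, so its flight time over a field-free tube of length L scales with the square root of its mass. The sketch below is an idealized illustration; the 20 kV acceleration voltage and the 1 m tube length are assumed values, not parameters taken from the text, and the initial velocity of the desorbed ions is neglected.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def flight_time(mass_da: float, charge: int = 1,
                voltage: float = 20_000.0, tube_length: float = 1.0) -> float:
    """Idealized flight time in seconds through a field-free tube.

    Energy balance: z * e * U = 1/2 * m * v**2, hence
    v = sqrt(2 * z * e * U / m) and t = L / v.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return tube_length / velocity


# Hypothetical singly charged analytes of 2,000 Da and 8,000 Da.
t_small = flight_time(2000.0)
t_large = flight_time(8000.0)
print(f"{t_small * 1e6:.1f} us, {t_large * 1e6:.1f} us")
```

A fourfold increase in mass doubles the flight time under these assumptions, which is why the mass resolution of the method depends on precise timing.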

In a particularly preferred embodiment of the method the amplification of step three is carried 
out in the presence of at least one species of blocker oligonucleotides. The use of such blocker 
oligonucleotides has been described by Yu et al., BioTechniques 23:714-720, 1997. The use 
of blocking oligonucleotides enables the improved specificity of the amplification of a
subpopulation of nucleic acids. Blocking probes hybridized to a nucleic acid suppress or hinder
the polymerase-mediated amplification of said nucleic acid. In one embodiment of the method
blocking oligonucleotides are designed so as to hybridize to background DNA. In a further 
embodiment of the method said oligonucleotides are designed so as to hinder or suppress the 
amplification of unmethylated nucleic acids as opposed to methylated nucleic acids or vice 
versa. 

Blocking probe oligonucleotides are hybridized to the bisulfite treated nucleic acid
concurrently with the PCR primers. PCR amplification of the nucleic acid is terminated at the 5'
position of the blocking probe, such that amplification of a nucleic acid is suppressed where
the complementary sequence to the blocking probe is present. The probes may be designed to
hybridize to the bisulfite treated nucleic acid in a methylation status specific manner. For
example, for detection of methylated nucleic acids within a population of unmethylated nucleic
acids, suppression of the amplification of nucleic acids which are unmethylated at the position
in question would be carried out by the use of blocking probes comprising a 'TpG' at the
position in question, as opposed to a 'CpG.' In one embodiment of the method the sequence of
said blocking oligonucleotides should be identical or complementary to a sequence at least 18
base pairs in length selected from the group consisting of SEQ ID NOs: 250-251, 372-373, 1132,
1133, 1136 and 1137, preferably comprising one or more CpG, TpG or CpA dinucleotides. In one
embodiment of the method the sequence of said oligonucleotides is selected from the group
consisting of SEQ ID NO: 1148 and SEQ ID NO: 1149 and sequences complementary thereto.
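The methylation status specific action of a blocking probe can be illustrated with a deliberately simplified model. The sketch below assumes complete bisulfite conversion and treats hybridization as an exact match between the blocker's target sequence and the treated template, which real hybridization is not; the function names, the 10-base template and the 6-base blocker target are hypothetical and far shorter than the at least 18 bases preferred in the text.

```python
def converted(seq: str, methylated=()) -> str:
    """Bisulfite-converted strand: unmethylated C is read as T."""
    return "".join(
        "T" if base == "C" and index not in methylated else base
        for index, base in enumerate(seq.upper())
    )


def blocker_binds(blocker_target: str, template: str) -> bool:
    """Exact-match model: the blocker binds if its target occurs in the template."""
    return blocker_target in template


# Hypothetical genomic fragment with one CpG starting at 0-based position 3.
genomic = "GGACGTTAGC"
unmethylated_template = converted(genomic)
methylated_template = converted(genomic, methylated={3})

# Blocker target carrying 'TpG' at the position in question.
blocker_target = "GATGTT"

print(blocker_binds(blocker_target, unmethylated_template))  # amplification suppressed
print(blocker_binds(blocker_target, methylated_template))    # amplification proceeds
```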

For PCR methods using blocker oligonucleotides, efficient disruption of polymerase-mediated 
amplification requires that blocker oligonucleotides not be elongated by the polymerase.
Preferably, this is achieved through the use of blockers that are 3'-deoxyoligonucleotides, or
oligonucleotides derivatised at the 3' position with other than a "free" hydroxyl group. For
example, 3'-O-acetyl oligonucleotides are representative of a preferred class of blocker
molecule.

Additionally, polymerase-mediated decomposition of the blocker oligonucleotides should be 
precluded. Preferably, such preclusion comprises either use of a polymerase lacking 5'-3'
exonuclease activity, or use of modified blocker oligonucleotides having, for example, thioate
bridges at the 5'-termini thereof that render the blocker molecule nuclease-resistant. Particular
applications may not require such 5' modifications of the blocker. For example, if the
blocker- and primer-binding sites overlap, thereby precluding binding of the primer (e.g., with
excess blocker), degradation of the blocker oligonucleotide will be substantially precluded.
This is because the polymerase will not extend the primer toward, and through (in the 5'-3'
direction) the blocker - a process that normally results in degradation of the hybridized 
blocker oligonucleotide. 

A particularly preferred blocker/PCR embodiment, for purposes of the present invention and 
as implemented herein, comprises the use of peptide nucleic acid (PNA) oligomers as block- 
ing oligonucleotides. Such PNA blocker oligomers are ideally suited, because they are neither 
decomposed nor extended by the polymerase. 

In one embodiment of the method, the binding site of the blocking oligonucleotide is identical 
to, or overlaps with that of the primer and thereby hinders the hybridization of the primer to 
its binding site. In a further preferred embodiment of the method, two or more such blocking 
oligonucleotides are used. In a particularly preferred embodiment, the hybridization of one of 
the blocking oligonucleotides hinders the hybridization of a forward primer, and the hybridi- 
zation of another of the probe (blocker) oligonucleotides hinders the hybridization of a re- 
verse primer that binds to the amplificate product of said forward primer. 

In an alternative embodiment of the method, the blocking oligonucleotide hybridizes to a lo- 
cation between the reverse and forward primer positions of the treated background DNA, 
thereby hindering the elongation of the primer oligonucleotides. 

It is particularly preferred that the blocking oligonucleotides are present in at least 5 times the 
concentration of the primers.

In the fourth step of the method, the amplificates obtained during the third step of the method
are analyzed in order to ascertain the methylation status of the CpG dinucleotides prior to the 
treatment. 

In embodiments where the amplificates are obtained by means of MSP amplification and/or 
blocking oligonucleotides, the presence or absence of an amplificate is in itself indicative of 
the methylation state of the CpG positions covered by the primers and/or blocking
oligonucleotide, according to the base sequences thereof. All possible known molecular biological
methods may be used for this detection, including, but not limited to gel electrophoresis, se- 
quencing, liquid chromatography, hybridizations, real time PCR analysis or combinations 
thereof. This step of the method further acts as a qualitative control of the preceding steps. 

In the fourth step of the method amplificates obtained by means of both standard and meth- 
ylation specific PCR are further analyzed in order to determine the CpG methylation status of 
the genomic DNA isolated in the first step of the method. This may be carried out by means 
of hybridization-based methods such as, but not limited to, array technology and probe based 
technologies as well as by means of techniques such as sequencing and template directed ex- 
tension. 

In one embodiment of the method, the amplificates synthesized in step three are subsequently 
hybridized to an array or a set of oligonucleotides and/or PNA probes. In this context, the 
hybridization takes place in the following manner: the set of probes used during the hybridi- 
zation is preferably composed of at least 2 oligonucleotides or PNA-oligomers; in the process, 
the amplificates serve as probes which hybridize to oligonucleotides previously bonded to a 
solid phase; the non-hybridized fragments are subsequently removed; said oligonucleotides 
contain at least one base sequence having a length of at least 9 nucleotides which is reverse 
complementary or identical to a segment of the base sequences specified in the SEQ ID NO: 
250-251, 372-373 and SEQ ID NOs: 1132, 1133, 1136 and 1137 and the segment comprises at
least one CpG, TpG or CpA dinucleotide. In further embodiments said oligonucleotides
contain at least one base sequence having a length of at least 9 nucleotides which is reverse
complementary or identical to a segment of the base sequences specified in the SEQ ID NO:
250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139 and SEQ ID NOs: 302-303, 296-297,
214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397,
358-359, 412-413, 350-351; and the segment comprises at least one CpG, TpG or CpA
dinucleotide.

In a preferred embodiment, said dinucleotide is present in the central third of the oligomer. 
For example, wherein the oligomer comprises one CpG dinucleotide, said dinucleotide is 
preferably the fifth to ninth nucleotide from the 5'-end of a 13-mer. In a further embodiment
one oligonucleotide exists for the analysis of each CpG dinucleotide within the sequences
according to SEQ ID NO: 23 and 1130, and the equivalent positions within SEQ ID NO:
250-251, 372-373 and SEQ ID NO: 1132, 1133, 1136 and 1137. One oligonucleotide exists for the
analysis of each CpG dinucleotide within the sequence according to SEQ ID NO: 23, SEQ ID
NOs: 1130, 1131, and SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID NO: 35,
SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12, and the equivalent positions within SEQ
ID NO: 250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139, and SEQ ID NOs: 302-303,
296-297, 214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419,
336-337, 396-397, 358-359, 412-413, 350-351. Said oligonucleotides may also be present in
the form of peptide nucleic acids. The non-hybridized amplificates are then removed. The 
hybridized amplificates are detected. In this context, it is preferred that labels attached to the 
amplificates are identifiable at each position of the solid phase at which an oligonucleotide 
sequence is located. 
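The preference that the dinucleotide lie in the central third of the oligomer can be illustrated as follows. The sketch assumes 1-based positions counted from the 5'-end and interprets the preference as applying to the cytosine of a single CpG dinucleotide; both the interpretation and the names `central_third` and `cpg_is_central` are hypothetical. For a 13-mer the computed range reproduces the fifth to ninth nucleotide stated above.

```python
def central_third(length: int) -> range:
    """1-based positions, counted from the 5'-end, forming the central third."""
    third = length // 3
    return range(third + 1, length - third + 1)


def cpg_is_central(oligo: str) -> bool:
    """True if the oligomer has exactly one CpG whose C lies in the central third."""
    oligo = oligo.upper()
    starts = [i + 1 for i in range(len(oligo) - 1) if oligo[i:i + 2] == "CG"]
    return len(starts) == 1 and starts[0] in central_third(len(oligo))


# Hypothetical 13-mers written 5' to 3'.
central_probe = "ATTAGTCGTTAGA"   # CpG starts at position 7
terminal_probe = "ACGTTAGTTAGAT"  # CpG starts at position 2

print(list(central_third(13)))
print(cpg_is_central(central_probe), cpg_is_central(terminal_probe))
```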

In yet a further embodiment of the method, the genomic methylation status of the CpG posi- 
tions may be ascertained by means of oligonucleotide probes that are hybridized to the bisul- 
fite treated DNA concurrently with the PCR amplification primers (wherein said primers may 
either be methylation specific or standard). 

A particularly preferred embodiment of this method is the use of fluorescence-based Real 
Time Quantitative PCR (Heid et al., Genome Res. 6:986-994, 1996; also see United States
Patent No. 6,331,393). There are two preferred embodiments of utilizing this method. One 
embodiment, known as the TaqMan™ assay employs a dual-labeled fluorescent oligonucleo- 
tide probe. The TaqMan™ PCR reaction employs the use of a non-extendible interrogating 
oligonucleotide, called a TaqMan™ probe, which is designed to hybridize to a CpG-rich se- 
quence located between the forward and reverse amplification primers. The TaqMan™ probe 
further comprises a fluorescent "reporter moiety" and a "quencher moiety" covalently bound 
to linker moieties (e.g., phosphoramidites) attached to the nucleotides of the TaqMan™
oligonucleotide. Hybridized probes are displaced and broken down by the polymerase of the
amplification reaction, thereby leading to an increase in fluorescence. For analysis of
methylation within nucleic acids subsequent to bisulfite treatment, it is required that the probe
be methylation specific, as described in United States Patent No. 6,331,393 (hereby
incorporated by reference in its entirety), also known as the MethyLight assay. The second
preferred embodiment of this MethyLight technology is the use of dual-probe technology
(Lightcycler®), each probe carrying donor or recipient fluorescent moieties, wherein
hybridization of two probes in proximity to each other is indicated by an increase in
fluorescence, or the use of fluorescent amplification primers. Both
these techniques may be adapted in a manner suitable for use with bisulfite treated DNA, and 
moreover for methylation analysis within CpG dinucleotides. 

Also any combination of these probes or combinations of these probes with other known 
probes may be used. 

In a further preferred embodiment of the method, the fourth step of the method comprises the 
use of template-directed oligonucleotide extension, such as MS-SNuPE as described by Gon- 
zalgo and Jones, Nucleic Acids Res. 25:2529-2531, 1997. In said embodiment it is preferred 
that the methylation specific single nucleotide extension primer (MS-SNuPE primer) is iden- 
tical or complementary to a sequence at least nine but preferably no more than twenty five 
nucleotides in length of one or more of the sequences taken from the group of SEQ ID NO:
250-251, 372-373 and SEQ ID NOs: 1132, 1133, 1136 and 1137. However, it is preferred to
use fluorescently labeled nucleotides instead of radiolabeled nucleotides.

In yet a further embodiment of the method, the fourth step of the method comprises sequenc- 
ing and subsequent sequence analysis of the amplificate generated in the third step of the 
method (Sanger F., et al., Proc Natl Acad Sci USA 74:5463-5467, 1977). 

In the most preferred embodiment of the methylation analysis method the genomic nucleic 
acids are isolated and treated according to the first three steps of the method outlined above, 
namely: 

a) obtaining, from a subject, a biological sample having subject genomic DNA; 

b) extracting or otherwise isolating the genomic DNA;

c) treating the genomic DNA of b), or a fragment thereof, with one or more reagents to
convert cytosine bases that are unmethylated in the 5-position thereof to uracil or to another
base that is detectably dissimilar to cytosine in terms of hybridization properties; and wherein

d) amplifying subsequent to treatment in c) is carried out in a methylation specific manner, 
namely by use of methylation specific primers or blocking oligonucleotides, and further 
wherein 

e) detecting of the amplificates is carried out by means of a real-time detection probe, as de- 
scribed above. 

Preferably, where the subsequent amplification of d) is carried out by means of methylation 
specific primers, as described above, said methylation specific primers comprise a sequence 
having a length of at least 9 nucleotides which hybridizes to a treated nucleic acid sequence 
according to one of SEQ ID NO: 250-251, 372-373 and SEQ ID Nos: 1132, 1133, 1136 and
1137 and sequences complementary thereto, wherein the base sequence of said oligomers 
comprises at least one CpG dinucleotide. Additionally, further methylation specific primers 
may also be used for the analysis of a gene panel as described above wherein said primers 
comprise a sequence having a length of at least 9 nucleotides which hybridizes to a treated 
nucleic acid sequence according to one of SEQ ID Nos: 302-303, 296-297, 214-215, 274-275, 
236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397, 358-359, 412- 
413, 350-351 and SEQ ID Nos: 1134, 1135, 1138 and 1139 and sequences complementary 
thereto, wherein the base sequence of said oligomers comprises at least one CpG dinucleotide. 

In an alternative most preferred embodiment of the method, the subsequent amplification of 
d) is carried out in the presence of blocking oligonucleotides, as described above. It is par- 
ticularly preferred that said blocking oligonucleotides comprise a sequence having a length of 
at least 9 nucleotides which hybridizes to a treated nucleic acid sequence according to one of 
SEQ ID NO: 250-251, 372-373, SEQ ID Nos: 1132, 1133, 1136 and 1137 and sequences 
complementary thereto, wherein the base sequence of said oligomers comprises at least one 
CpG, TpG or CpA dinucleotide. 

Additionally, further blocking oligonucleotides may also be used for the analysis of a gene
panel as described above, wherein said blocking oligonucleotides comprise a sequence having
a length of at least 9 nucleotides which hybridizes to a treated nucleic acid sequence according
to one of SEQ ID Nos: 302-303, 296-297, 214-215, 274-275, 236-237, 290-291, 228-229,
250-251, 424-425, 418-419, 336-337, 396-397, 358-359, 412-413, 350-351 and SEQ ID
Nos: 1134, 1135, 1138 and 1139 and sequences complementary thereto, wherein the base
sequence of said oligomers comprises at least one CpG, TpG or CpA dinucleotide.

Step e) of the method, namely the detection of the specific amplificates indicative of the 
methylation status of one or more CpG positions according to SEQ ID NO: 250-251, 372-373, 
SEQ ID NO: 1132 to SEQ ID NO: 1139, AND SEQ ID Nos: 302-303, 296-297, 214-215, 
274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397, 358- 
359, 412-413, 350-351, and most preferably SEQ ID NO: 250-251, 372-373 and SEQ ID Nos:
1132, 1133, 1136 and 1137 is carried out by means of real-time detection methods as de- 
scribed above. 

Additional embodiments of the invention provide a method for the analysis of the methylation 
status of the gene PITX2 and/or regulatory regions thereof without the need for pre-treatment. 
Furthermore, said method may also be used where the methylation state of the gene PITX2
and/or regulatory regions thereof and the methylation state of one or more genes selected
from the group consisting of ABCA8, CDK6, ERBB2, ONECUT2, PLAU, TBC1D3, TFF1
and/or regulatory or promoter regions thereof is determined. It is particularly preferred that
the methylation status of a gene panel selected from the group of gene panels consisting of
PITX2, PLAU and TFF1; PITX2 and PLAU; and PITX2 and TFF1 is determined.

In the first step of such additional embodiments, the genomic DNA sample is isolated from 
tissue or cellular sources. Preferably, such sources include cell lines, histological slides, bi- 
opsy tissue, body fluids, or breast tumor tissue embedded in paraffin. Extraction may be by 
means that are standard to one skilled in the art, including but not limited to the use of
detergent lysates, sonification and vortexing with glass beads. Once the nucleic acids have been
extracted, the genomic double-stranded DNA is used in the analysis.

In a preferred embodiment, the DNA may be cleaved prior to the treatment, and this may be 
by any means standard in the state of the art, but preferably with methylation-sensitive
restriction endonucleases.

In the second step, the DNA is then digested with one or more methylation-sensitive restriction
enzymes. The digestion is carried out such that hydrolysis of the DNA at the restriction
site is informative of the methylation status of a specific CpG dinucleotide. 

In the third step, which is optional but a preferred embodiment, the restriction fragments are 
amplified. This is preferably carried out using a polymerase chain reaction, and said amplifi- 
cates may carry suitable detectable labels as discussed above, namely fluorophore labels, ra- 
dionuclides and mass labels. 

In the fourth step the amplificates are detected. The detection may be by any means standard 
in the art, for example, but not limited to, gel electrophoresis analysis, hybridization analysis, 
incorporation of detectable tags within the PCR products, DNA array analysis, MALDI or
ESI analysis. 

In the final step of the method the prognosis and/or predicted outcome of endocrine treatment
is determined. Preferably, the correlation of the expression level of the genes with the
prognosis and/or predicted outcome of endocrine treatment is done substantially without human
intervention. A poor prognosis and/or poor predicted outcome of endocrine treatment is determined
by aberrant levels of mRNA and/or protein, and by hypermethylation. It is particularly preferred
that said hypermethylation is above the average or above the median of said disease in said
specific setting.

It is particularly preferred that the classification of the sample is carried out by algorithmic 
means. 

In one embodiment machine learning predictors are trained on the methylation patterns at the 
investigated CpG sites of the samples with known status. A selection of the CpG positions 
which are discriminative for the machine learning predictor are used in the panel. In a par- 
ticularly preferred embodiment of the method, both methods are combined; that is, the ma- 
chine learning classifier is trained only on the selected CpG positions that are significantly 
differentially methylated between the classes according to the statistical analysis. 
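
The combination of position selection and classifier training described above can be sketched as follows. This is a minimal illustration only: the t-statistic ranking, the nearest-centroid classifier, the class labels and all methylation values are assumptions chosen for brevity, not the statistical test or predictor used in the examples.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(a, b):
    """Welch-style t statistic between two lists of methylation values."""
    denom = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return 0.0 if denom == 0 else (mean(a) - mean(b)) / denom


def select_positions(samples, labels, top_k):
    """Keep the top_k CpG positions that differ most between the two classes."""
    first, second = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        a = [s[j] for s, y in zip(samples, labels) if y == first]
        b = [s[j] for s, y in zip(samples, labels) if y == second]
        scores.append((abs(t_statistic(a, b)), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:top_k])


def train_centroids(samples, labels, positions):
    """Train a nearest-centroid classifier on the selected positions only."""
    centroids = {}
    for y in set(labels):
        rows = [s for s, lab in zip(samples, labels) if lab == y]
        centroids[y] = [mean(r[j] for r in rows) for j in positions]
    return centroids


def classify(sample, centroids, positions):
    """Assign the class whose centroid is closest on the selected positions."""
    def squared_distance(centroid):
        return sum((sample[j] - c) ** 2 for j, c in zip(positions, centroid))
    return min(centroids, key=lambda y: squared_distance(centroids[y]))


# Hypothetical methylation proportions (rows: samples, columns: CpG positions).
samples = [
    [0.90, 0.5, 0.4], [0.80, 0.4, 0.6], [0.85, 0.6, 0.5],
    [0.20, 0.5, 0.5], [0.10, 0.6, 0.4], [0.15, 0.4, 0.6],
]
labels = ["non-responder"] * 3 + ["responder"] * 3

positions = select_positions(samples, labels, top_k=1)
centroids = train_centroids(samples, labels, positions)
print(positions, classify([0.7, 0.5, 0.5], centroids, positions))
```

Restricting training to the selected positions keeps the predictor small and ties it to the CpG positions that would make up the panel.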

The development of algorithmic methods for the classification of a sample based on the
methylation status of the CpG positions within the panel is demonstrated in the examples.



WO 2005/059172 PCT/EP2004/014170



The disclosed invention provides treated nucleic acids, derived from genomic SEQ ID NO: 
23, SEQ ID NO: 1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID 
NO: 5, SEQ ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12, wherein the 
treatment is suitable to convert at least one unmethylated cytosine base of the genomic DNA 
sequence to uracil or another base that is detectably dissimilar to cytosine in terms of hybridi- 
zation. The genomic sequences in question may comprise one, or more, consecutive or ran- 
dom methylated CpG positions. Said treatment preferably comprises use of a reagent selected 
from the group consisting of bisulfite, hydrogen sulfite, disulfite, and combinations thereof. In 
a preferred embodiment of the invention, the objective comprises analysis of a non-naturally
occurring modified nucleic acid comprising a sequence of at least 16 contiguous nucleotide
bases in length of a sequence selected from the group consisting of SEQ ID NO: 23, SEQ ID
NO: 1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ
ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12, wherein said sequence
comprises at least one CpG, TpA or CpA dinucleotide and sequences complementary thereto. The
sequences of SEQ ID NO: 250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139 AND
SEQ ID Nos: 302-303, 296-297, 214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 
424-425, 418-419, 336-337, 396-397, 358-359, 412-413, 350-351 provide non-naturally oc- 
curring modified versions of the nucleic acid according to SEQ ID NO: 23, SEQ ID NO: 
1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID 
NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12, wherein the modification of each 
genomic sequence results in the synthesis of a nucleic acid having a sequence that is unique 
and distinct from said genomic sequence as follows. For each sense strand genomic DNA, 
e.g., SEQ ID NO: 23, four converted versions are disclosed. In a first version, "C" is converted
to "T," but "CpG" remains "CpG" (i.e., this corresponds to the case where, for the genomic
sequence, all "C" residues of CpG dinucleotide sequences are methylated and are thus not
converted); a second version discloses the complement of the disclosed genomic DNA sequence
(i.e. antisense strand), wherein "C" is converted to "T," but "CpG" remains "CpG" (i.e., this
corresponds to the case where all "C" residues of CpG dinucleotide sequences are methylated
and are thus not converted). The 'upmethylated' converted sequences of SEQ ID NO: 23, SEQ ID NO: 1130,
SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, 
SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12 correspond to SEQ ID NO: 250-251, 372- 
373, SEQ ID NO: 1132 to SEQ ID NO: 1139 AND SEQ ID Nos: 302-303, 296-297, 214-215, 
274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397, 358-359,
412-413, 350-351. A third chemically converted version of each genomic sequence is
provided, wherein "C" is converted to "T" for all "C" residues, including those of "CpG"
dinucleotide sequences (i.e., this corresponds to the case where, for the genomic sequences, all
"C" residues of CpG dinucleotide sequences are unmethylated); a final chemically converted
version of each sequence discloses the complement of the disclosed genomic DNA sequence
(i.e. antisense strand), wherein "C" is converted to "T" for all "C" residues, including those of
"CpG" dinucleotide sequences (i.e., this corresponds to the case where, for the complement
(antisense strand) of each genomic sequence, all "C" residues of CpG dinucleotide sequences
are unmethylated). The
'downmethylated' converted sequences of SEQ ID NO: 23, SEQ ID NO: 1130, SEQ ID NO: 
1131 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 
16, SEQ ID NO: 43, SEQ ID NO: 12 correspond to SEQ ID NO: 250-251, 372-373, SEQ ID 
NO: 1132 to SEQ ID NO: 1139 AND SEQ ID Nos: 302-303, 296-297, 214-215, 274-275, 
236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397, 358-359, 412- 
413, 350-351.
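
The four converted versions described above can be illustrated with a short in-silico sketch. It assumes idealized conversion (every unmethylated "C" reads as "T") and assumes that the antisense version is written as the reverse complement in 5' to 3' orientation; the function names and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def convert(seq, cpg_methylated):
    """Idealized in-silico conversion of a single strand.

    cpg_methylated=True : "C" becomes "T" except where it is followed by "G"
                          (all CpG cytosines methylated, hence protected).
    cpg_methylated=False: every "C" becomes "T" (no cytosine methylated).
    """
    seq = seq.upper()
    if cpg_methylated:
        return re.sub(r"C(?!G)", "T", seq)
    return seq.replace("C", "T")


def four_converted_versions(genomic):
    """Return the four converted versions of one genomic sequence."""
    sense = genomic.upper()
    antisense = reverse_complement(sense)
    return {
        "sense, CpG methylated": convert(sense, True),
        "antisense, CpG methylated": convert(antisense, True),
        "sense, CpG unmethylated": convert(sense, False),
        "antisense, CpG unmethylated": convert(antisense, False),
    }


# Toy sequence for illustration only.
versions = four_converted_versions("ACGTCCG")
for name, sequence in versions.items():
    print(f"{name}: {sequence}")
```

Because conversion destroys strand complementarity, the converted sense and antisense strands are distinct sequences, which is why each genomic sequence gives rise to four converted versions rather than two.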

The invention further discloses an oligonucleotide or oligomer for detecting the cytosine
methylation state within genomic or pre-treated DNA, according to SEQ ID NO: 23, SEQ ID
NO: 1130 to SEQ ID NO: 1139 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ
ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12. Said oligonucleotide or
oligomer comprises a nucleic acid sequence having a length of at least nine (9) nucleotides
which hybridizes, under moderately stringent or stringent conditions (as defined herein 
above), to a treated nucleic acid sequence according to SEQ ID NO: 250-251, 372-373, SEQ 
ID NO: 1132 to SEQ ID NO: 1139 AND SEQ ID Nos: 302-303, 296-297, 214-215, 274-275, 
236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397, 358-359, 412- 
413, 350-351 and/or sequences complementary thereto, or to a genomic sequence according 
to SEQ ID NO: 23, SEQ ID NO: 1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID 
NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12 
and/or sequences complementary thereto. 

Thus, the present invention includes nucleic acid molecules (e.g., oligonucleotides and pep- 
tide nucleic acid (PNA) molecules (PNA-oligomers)) that hybridize under moderately strin- 
gent and/or stringent hybridization conditions to all or a portion of the sequences SEQ ID NO: 
250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139 AND SEQ ID Nos: 302-303, 296- 
297, 214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337,
396-397, 358-359, 412-413, 350-351, or to the complements thereof. The hybridizing portion
of the hybridizing nucleic acids is typically at least 9, 15, 20, 25, 30 or 35 nucleotides in 
length. However, longer molecules have inventive utility, and are thus within the scope of the 
present invention. 

Preferably, the hybridizing portion of the inventive hybridizing nucleic acids is at least 95%, 
or at least 98%, or 100% identical to the sequence, or to a portion thereof of SEQ ID NO: 
250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139 AND SEQ ID Nos: 302-303, 296- 
297, 214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 
396-397, 358-359, 412-413, 350-351, or to the complements thereof.

Hybridizing nucleic acids of the type described herein can be used, for example, as a primer 
(e.g., a PCR primer), or a diagnostic and/or prognostic probe or primer. Preferably, hybridization
of the oligonucleotide probe to a nucleic acid sample is performed under stringent conditions
and the probe is 100% identical to the target sequence. Nucleic acid duplex or hybrid
stability is expressed as the melting temperature or Tm, which is the temperature at which a 
probe dissociates from a target DNA. This melting temperature is used to define the required 
stringency conditions. 

For target sequences that are related and substantially identical to the corresponding sequence 
of SEQ ID NO: 23, SEQ ID NO: 1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID 
NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12 
(such as allelic variants and SNPs), rather than identical, it is useful to first establish the low- 
est temperature at which only homologous hybridization occurs with a particular concentra- 
tion of salt (e.g., SSC or SSPE). Then, assuming that 1% mismatching results in a 1°C de- 
crease in the Tm, the temperature of the final wash in the hybridisation reaction is reduced 
accordingly (for example, if sequences having > 95% identity with the probe are sought, the 
final wash temperature is decreased by 5°C). In practice, the change in Tm can be between 
0.5°C and 1.5°C per 1% mismatch. 
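
The wash-temperature adjustment described above amounts to simple arithmetic, sketched below. The function name and the starting temperature of 65°C are hypothetical; only the rule of roughly 1°C per 1% mismatch, with a practical range of 0.5°C to 1.5°C, is taken from the text.

```python
def final_wash_temperature(homologous_temp_c, min_identity_pct,
                           degrees_per_pct_mismatch=1.0):
    """Lower the final wash temperature to admit imperfectly matched targets.

    homologous_temp_c        : lowest temperature (in degrees C) at which only
                               homologous hybridization occurs at the chosen
                               salt concentration
    min_identity_pct         : minimum identity with the probe that is sought
    degrees_per_pct_mismatch : assumed Tm decrease per 1% mismatch
    """
    if not 0 < min_identity_pct <= 100:
        raise ValueError("identity must be in the range (0, 100]")
    mismatch_pct = 100.0 - min_identity_pct
    return homologous_temp_c - mismatch_pct * degrees_per_pct_mismatch


# Hypothetical starting point of 65 degrees C, seeking at least 95% identity.
print(final_wash_temperature(65.0, 95))        # nominal 1 degree per 1%
print(final_wash_temperature(65.0, 95, 0.5))   # lower end of practical range
print(final_wash_temperature(65.0, 95, 1.5))   # upper end of practical range
```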

Examples of inventive oligonucleotides of length X (in nucleotides), as indicated by
polynucleotide positions with reference to, e.g., SEQ ID NO: 23, include those corresponding to sets
(sense and antisense sets) of consecutively overlapping oligonucleotides of length X, where
the oligonucleotides within each consecutively overlapping set (corresponding to a given X
value) are defined as the finite set of Z oligonucleotides from nucleotide positions:

n to (n + (X-1));

where n = 1, 2, 3, ..., (Y-(X-1));

where Y equals the length (nucleotides or base pairs) of SEQ ID NO: 23 (9001);

where X equals the common length (in nucleotides) of each oligonucleotide in the set (e.g.,
X = 20 for a set of consecutively overlapping 20-mers); and

where the number (Z) of consecutively overlapping oligomers of length X for a given SEQ ID
NO of length Y is equal to Y-(X-1). For example, Z = 9001 - 19 = 8,982 for either sense or
antisense sets of SEQ ID NO: 23, where X = 20.

Preferably, the set is limited to those oligomers that comprise at least one CpG, TpG or CpA 
dinucleotide. 

Examples of inventive 20-mer oligonucleotides include the following set of oligomers (and
the antisense set complementary thereto), indicated by polynucleotide positions with reference
to SEQ ID NO: 23: 1-20, 2-21, 3-22, 4-23, 5-24, ... and 8,982-9,001.

Preferably, the set is limited to those oligomers that comprise at least one CpG, TpG or CpA 
dinucleotide. 

Likewise, examples of inventive 25-mer oligonucleotides include the following set of
oligomers (and the antisense set complementary thereto), indicated by polynucleotide positions
with reference to SEQ ID NO: 23: 1-25, 2-26, 3-27, 4-28, 5-29, ... and 8,977-9,001.

Preferably, the set is limited to those oligomers that comprise at least one CpG, TpG or CpA 
dinucleotide. 
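
The consecutively overlapping sets defined above, together with the restriction to oligomers containing a CpG, TpG or CpA dinucleotide, can be enumerated as in the following sketch. The function names and the short test sequence are hypothetical; the positions are 1-based, as in the text.

```python
def overlapping_windows(y, x):
    """Yield 1-based (start, end) positions n to n + (X-1), n = 1 .. Y-(X-1)."""
    for n in range(1, y - (x - 1) + 1):
        yield n, n + (x - 1)


def informative_oligomers(seq, x, dinucleotides=("CG", "TG", "CA")):
    """Yield (start, end, oligomer) for windows holding a CpG, TpG or CpA."""
    seq = seq.upper()
    for start, end in overlapping_windows(len(seq), x):
        oligo = seq[start - 1:end]
        if any(d in oligo for d in dinucleotides):
            yield start, end, oligo


# Counts for a sequence of length Y = 9001, as stated for SEQ ID NO: 23.
windows_20 = list(overlapping_windows(9001, 20))
windows_25 = list(overlapping_windows(9001, 25))
print(len(windows_20), windows_20[0], windows_20[-1])
print(len(windows_25), windows_25[0], windows_25[-1])

# Toy sequence for illustration only.
print(list(informative_oligomers("AAAACGAAAA", 4)))
```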

The present invention encompasses, for each of SEQ ID NO: 23, 250-251, 372-373, SEQ
ID NO: 1130 to AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, 
SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12 (sense and antisense), multiple consecu- 
tively overlapping sets of oligonucleotides or modified oligonucleotides of length X, where, 
e.g., X = 9, 10, 17, 20, 22, 23, 25, 27, 30 or 35 nucleotides.

The oligonucleotides or oligomers according to the present invention constitute effective tools
useful to ascertain genetic and epigenetic parameters of the genomic sequence corresponding 
to SEQ ID NO: 23, SEQ ID NO: 1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID 
NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12. 
Preferred sets of such oligonucleotides or modified oligonucleotides of length X are those 
consecutively overlapping sets of oligomers corresponding to SEQ ID NO: 23, 250-251, 372- 
373, SEQ ID NO: 1130 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID NO: 
35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12 (and to the complements thereof). 
Preferably, said oligomers comprise at least one CpG, TpG or CpA dinucleotide. 

Particularly preferred oligonucleotides or oligomers according to the present invention are
those in which the cytosine of the CpG dinucleotide (or of the corresponding converted TpG
or CpA dinucleotide) sequences is within the middle third of the oligonucleotide; that is,
where the oligonucleotide is, for example, 13 bases in length, the CpG, TpG or CpA
dinucleotide is positioned within the fifth to ninth nucleotide from the 5'-end.
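
The "middle third" criterion can be checked as in the sketch below. The rounding rule is an assumption chosen so that a 13-mer yields the fifth to ninth nucleotide, as in the example above, and the position tested is that of the first base of the dinucleotide; both choices, like the function names and test oligomers, are hypothetical.

```python
from math import ceil


def middle_third_bounds(length):
    """Return 1-based (first, last) positions of the middle third."""
    first = length // 3 + 1
    last = ceil(2 * length / 3)
    return first, last


def has_central_dinucleotide(oligo, dinucleotides=("CG", "TG", "CA")):
    """True if a CpG, TpG or CpA starts within the middle third of the oligomer."""
    oligo = oligo.upper()
    first, last = middle_third_bounds(len(oligo))
    for i in range(len(oligo) - 1):
        position = i + 1  # 1-based position of the first base of the dinucleotide
        if oligo[i:i + 2] in dinucleotides and first <= position <= last:
            return True
    return False


print(middle_third_bounds(13))
print(has_central_dinucleotide("AAAAACGAAAAAA"))  # CpG at positions 6-7
print(has_central_dinucleotide("ACGAAAAAAAAAA"))  # CpG at positions 2-3
```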

The oligonucleotides of the invention can also be modified by chemically linking the oligonu- 
cleotide to one or more moieties or conjugates to enhance the activity, stability or detection of 
the oligonucleotide. Such moieties or conjugates include chromophores, fluorophores, lipids 
such as cholesterol, cholic acid, thioether, aliphatic chains, phospholipids, polyamines, poly- 
ethylene glycol (PEG), palmityl moieties, and others as disclosed in, for example, United 
States Patent Numbers 5,514,758, 5,574,142, 5,585,481, 5,587,371, 5,597,696 and 5,958,773. 
The probes may also exist in the form of a PNA (peptide nucleic acid) which has particularly 
preferred pairing properties. Thus, the oligonucleotide may include other appended groups 
such as peptides, and may include hybridization-triggered cleavage agents (Krol et al, Bio- 
Techniques 6:958-976, 1988) or intercalating agents (Zon, Pharm. Res. 5:539-549, 1988). To 
this end, the oligonucleotide may be conjugated to another molecule, e.g., a chromophore, 
fluorophore, peptide, hybridization-triggered cross-linking agent, transport agent,
hybridization-triggered cleavage agent, etc.

The oligonucleotide may also comprise at least one art-recognized modified sugar and/or base 
moiety, or may comprise a modified backbone or non-natural internucleoside linkage.

The oligonucleotides or oligomers according to particular embodiments of the present invention
are typically used in 'sets,' which contain at least one oligomer for analysis of each of the
CpG dinucleotides of genomic sequences SEQ ID NO: 23, SEQ ID NO: 1130, SEQ ID NO: 
1131 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 
16, SEQ ID NO: 43, SEQ ID NO: 12 and sequences complementary thereto, or to the corre- 
sponding CpG, TpG or CpA dinucleotide within a sequence of the treated nucleic acids ac- 
cording to SEQ ID NO: 250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139 AND 
SEQ ID Nos: 302-303, 296-297, 214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 
424-425, 418-419, 336-337, 396-397, 358-359, 412-413, 350-351 and sequences comple- 
mentary thereto. However, it is anticipated that for economic or other factors it may be prefer- 
able to analyze a limited selection of the CpG dinucleotides within said sequences, and the 
content of the set of oligonucleotides is altered accordingly. 

Therefore, in particular embodiments, the present invention provides a set of at least two (2) 
(oligonucleotides and/or PNA-oligomers) useful for detecting the cytosine methylation state 
of treated genomic DNA (SEQ ID NO: 250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO:
1139 AND SEQ ID Nos: 302-303, 296-297, 214-215, 274-275, 236-237, 290-291, 228-229,
250-251, 424-425, 418-419, 336-337, 396-397, 358-359, 412-413, 350-351), or in genomic
DNA (SEQ ID NO: 23, SEQ ID NO: 1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID
NO: 46, SEQ ID NO: 5, SEQ ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12 
and sequences complementary thereto). These probes enable diagnosis and/or classification
of genetic and epigenetic parameters of breast cell proliferative disorders. The set of oligomers
may also be used for detecting single nucleotide polymorphisms (SNPs) in treated genomic 
DNA (SEQ ID NO: 250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139 AND SEQ ID
Nos: 302-303, 296-297, 214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 424-425,
418-419, 336-337, 396-397, 358-359, 412-413, 350-351), or in genomic DNA (SEQ ID NO:
23, SEQ ID NO: 1130, SEQ ID NO: 1131 AND SEQ ID NO: 49, SEQ ID NO: 46, SEQ ID 
NO: 5, SEQ ID NO: 35, SEQ ID NO: 16, SEQ ID NO: 43, SEQ ID NO: 12 and sequences 
complementary thereto). 

In preferred embodiments, at least one, and more preferably all, members of a set of
oligonucleotides are bound to a solid phase.

In further embodiments, the present invention provides a set of at least two (2) oligonucleotides
that are used as 'primer' oligonucleotides for amplifying DNA sequences of one of SEQ
ID NO: 250-251, 372-373, SEQ ID NO: 1132 to SEQ ID NO: 1139 AND SEQ ID Nos: 302-
303, 296-297, 214-215, 274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 
336-337, 396-397, 358-359, 412-413, 350-351 and sequences complementary thereto, or 
segments thereof. 

It is anticipated that the oligonucleotides may constitute all or part of an "array" or "DNA 
chip" (i.e., an arrangement of different oligonucleotides and/or PNA-oligomers bound to a 
solid phase). Such an array of different oligonucleotide- and/or PNA-oligomer sequences can 
be characterized, for example, in that it is arranged on the solid phase in the form of a rectan- 
gular or hexagonal lattice. The solid-phase surface may be composed of silicon, glass, poly- 
styrene, aluminium, steel, iron, copper, nickel, silver, or gold. Nitrocellulose as well as plas- 
tics such as nylon, which can exist in the form of pellets or also as resin matrices, may also be 
used. An overview of the prior art in oligomer array manufacturing can be gathered from a 
special edition of Nature Genetics (Nature Genetics Supplement, Volume 21, January 1999, 
and from the literature cited therein). Fluorescently labeled probes are often used for the
scanning of immobilized DNA arrays. The simple attachment of Cy3 and Cy5 dyes to the 5'-OH
of the specific probe is particularly suitable for fluorescence labeling. The detection of the
fluorescence of the hybridized probes may be carried out, for example, via a confocal micro- 
scope. Cy3 and Cy5 dyes, besides many others, are commercially available. 

It is also anticipated that the oligonucleotides, or particular sequences thereof, may constitute 
all or part of a "virtual array" wherein the oligonucleotides, or particular sequences thereof,
are used, for example, as 'specifiers' as part of, or in combination with a diverse population of 
unique labeled probes to analyze a complex mixture of analytes. Such a method, for example 
is described in US 2003/0013091 (United States serial number 09/898,743, published 16 
January 2003). In such methods, enough labels are generated so that each nucleic acid in the 
complex mixture (i.e., each analyte) can be uniquely bound by a unique label and thus de- 
tected (each label is directly counted, resulting in a digital read-out of each molecular species 
in the mixture). 

The described invention further provides a composition of matter useful for providing a
prognosis and/or prediction of outcome of endocrine treatment of breast cancer patients. Said
composition comprises at least one nucleic acid 18 base pairs in length of a segment of the
nucleic acid sequence disclosed in SEQ ID NO: 250-251, 372-373, 1132, 1133, 1136 and
1137, and one or more substances taken from the group comprising: magnesium chloride,
dNTP, Taq polymerase, bovine serum albumin, an oligomer, in particular an oligonucleotide or
peptide nucleic acid (PNA)-oligomer, said oligomer comprising in each case at least one base
sequence having a length of at least 9 nucleotides which is complementary to, or hybridizes 
under moderately stringent or stringent conditions to a pretreated genomic DNA according to 
one of the SEQ ID NO: 250-251, 372-373 and SEQ ID NO: 1132, 1133, 1136 and 1137 and 
sequences complementary thereto. It is preferred that said composition of matter comprises a 
buffer solution appropriate for the stabilization of said nucleic acid in an aqueous solution and 
enabling polymerase based reactions within said solution. Suitable buffers are known in the 
art and commercially available. 

Moreover, an additional aspect of the present invention is a kit comprising, for example: a 
bisulfite-containing reagent as well as at least one oligonucleotide whose sequences in each 
case correspond, are complementary, or hybridize under stringent or highly stringent
conditions to an 18-base-long segment of the sequences SEQ ID NO: 250-251, 372-373, 1132, 1133,
1136 and 1137. Said kit may further comprise at least one oligonucleotide whose sequences in
each case correspond, are complementary, or hybridize under stringent or highly stringent
conditions to an 18-base-long segment of the sequences SEQ ID Nos: 302-303, 296-297, 214-
215, 274-275, 236-237, 290-291, 228-229, 250-251, 424-425, 418-419, 336-337, 396-397, 
358-359, 412-413, 350-351. Said kit may further comprise instructions for carrying out and 
evaluating the described method. In a further preferred embodiment, said kit may further 
comprise standard reagents for performing a CpG position-specific methylation analysis, 
wherein said analysis comprises one or more of the following techniques: MS-SNuPE, MSP, 
MethyLight®, HeavyMethyl®, COBRA, and nucleic acid sequencing. However, a kit along
the lines of the present invention can also contain only part of the aforementioned compo- 
nents. 

Typical reagents (e.g., as might be found in a typical COBRA-based kit) for COBRA analysis 
may include, but are not limited to: PCR primers for specific gene (or methylation-altered 
DNA sequence or CpG island); restriction enzyme and appropriate buffer; gene-hybridization 
oligo; control hybridization oligo; kinase labeling kit for oligonucleotide probe; and
radioactive nucleotides. Additionally, bisulfite conversion reagents may include: DNA denaturation
buffer; sulfonation buffer; DNA recovery reagents or kits (e.g., precipitation, ultrafiltration,
affinity column); desulfonation buffer; and DNA recovery components. 

Typical reagents (e.g., as might be found in a typical MethyLight®-based kit) for Meth- 
yLight® analysis may include, but are not limited to: PCR primers for specific gene (or meth- 
ylation-altered DNA sequence or CpG island); TaqMan® probes; optimized PCR buffers and 
deoxynucleotides; and Taq polymerase. 

Typical reagents (e.g., as might be found in a typical Ms-SNuPE-based kit) for Ms-SNuPE 
analysis may include, but are not limited to: PCR primers for specific gene (or methylation- 
altered DNA sequence or CpG island); optimized PCR buffers and deoxynucleotides; gel ex- 
traction kit; positive control primers; Ms-SNuPE primers for specific gene; reaction buffer 
(for the Ms-SNuPE reaction); and radioactive nucleotides. Additionally, bisulfite conversion 
reagents may include: DNA denaturation buffer; sulfonation buffer; DNA recovery reagents or
kit (e.g., precipitation, ultrafiltration, affinity column); desulfonation buffer; and DNA recov- 
ery components. 

Typical reagents (e.g., as might be found in a typical MSP-based kit) for MSP analysis may 
include, but are not limited to: methylated and unmethylated PCR primers for specific gene 
(or methylation-altered DNA sequence or CpG island), optimized PCR buffers and deoxynu- 
cleotides, and specific probes. 

While the present invention has been described with specificity in accordance with certain of
its preferred embodiments, the following examples and figures serve only to illustrate the
invention and are not intended to limit the invention within the principles and scope of the
broadest interpretations and equivalent configurations thereof.

Figure 1 shows a preferred application of the method according to the invention. The X axis
shows the tumour(s) mass, wherein the line '3' shows the limit of detectability. The Y-axis
shows time. Accordingly said figure illustrates a simplified model of endocrine treatment of
a Stage 1-3 breast tumour wherein primary treatment was surgery (at point 1), followed by
adjuvant therapy with Tamoxifen. In a first scenario a responder to treatment (4) is shown as
remaining below the limit of detectability for the duration of the observation. A non-responder
to the treatment (5) has a period of disease-free survival (2) followed by relapse when the
carcinoma mass reaches the level of detectability.

Figure 2 shows another preferred application of the method according to the invention. The X
axis shows the tumour(s) mass, wherein the line '3' shows the limit of detectability. The Y-axis
shows time. Accordingly said figure illustrates a simplified model of endocrine treatment
of a late-stage breast tumour wherein primary treatment was surgery (at point 1), followed by
relapse which is treated by Tamoxifen (2). In a first scenario a responder to treatment (4) is
shown as remaining below the limit of detectability for the duration of the observation. A
non-responder to the treatment (5) does not recover from the relapse.

Figures 3 to 45 show the Kaplan-Meier estimated disease-free survival curves for single genes
or oligonucleotide positions. The black plot shows the proportion of disease free patients in
the population with above median methylation levels, the grey plot shows the proportion of
disease free patients in the population with below median methylation levels.
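
The construction of such curves can be sketched as follows. This is only an illustrative
sketch and not the implementation used for the figures: the function names are hypothetical,
the standard product-limit (Kaplan-Meier) estimator is assumed, and patients whose methylation
level equals the median are assigned to the "below" group by assumption.

```python
from statistics import median


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times  -- follow-up time of each patient
    events -- 1 if relapse was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0
        leaving = 0
        # Collect all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            leaving += 1
            i += 1
        if relapses:
            survival *= 1.0 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def curves_by_median(methylation, times, events):
    """Split patients at the median methylation level and estimate both curves.

    Assumption: values equal to the median are assigned to the 'below' group.
    """
    cut = median(methylation)
    groups = {"above": ([], []), "below": ([], [])}
    for m, t, e in zip(methylation, times, events):
        key = "above" if m > cut else "below"
        groups[key][0].append(t)
        groups[key][1].append(e)
    return {key: kaplan_meier(ts, es) for key, (ts, es) in groups.items()}
```

The "above" and "below" entries of the returned dictionary correspond to the black and grey
plots respectively.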

Figure 46 shows the methylation analysis of CpG islands according to Example 1. CpG islands
per gene were grouped and their correlation with objective response determined by
Hotelling's T² statistics. Black dots indicate the P-value of the indicated gene. The 20 most
informative genes, ranked from left to right with increasing P-value, are shown. The top dotted
line marks the uncorrected significance value (P < 0.05). The lower dotted line marks
significance after false discovery rate (FDR) correction of 25%. All genes with a P-value
smaller than or equal to that of the gene with the largest P-value that is below the lower line
(in this case COX7A2L) are considered significant. With the FDR correction chosen, each
identified gene has a 75% chance of being a true discovery.
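
The selection rule described for the lower dotted line matches a step-up procedure of the
Benjamini-Hochberg type. The text does not name the procedure that was used, so the following
is a sketch under that assumption, with a hypothetical function name and the 25% rate as the
default parameter.

```python
def benjamini_hochberg(p_values, q=0.25):
    """Return the indices of the tests declared significant at FDR level q.

    Assumption: the lower dotted line is the step-up boundary q * rank / m.
    """
    m = len(p_values)
    # Rank the tests by increasing P-value, as on the figure's horizontal axis.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank whose P-value lies on or below the boundary.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    # Every test ranked at or before the cutoff is significant, even if its own
    # P-value lies above the boundary at its own rank.
    return sorted(order[:cutoff_rank])
```

For example, with four tests and P-values 0.07, 0.1, 0.9 and 0.95, the first value lies above
its boundary (0.0625) but the second lies below its boundary (0.125), so both are selected.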

Figure 48 shows a ranked matrix of the best 11 amplificates of data obtained according to
Example 1 (Metastatic setting, limited sample set). P-values were calculated from Likelihood
ratio (LR) tests from multivariate logistic regression models. The figure is shown in
greyscale, wherein the most significant CpG positions are at the bottom of the matrix with
significance decreasing towards the top. Black indicates total methylation at a given CpG
position, white represents no methylation at the particular position, with degrees of
methylation represented in grey, from light (low proportion of methylation) to dark (high
proportion of methylation). Each row represents one specific CpG position within a gene and
each column shows the methylation profile for the different CpGs for one sample. The p-values
for the individual CpG positions are shown on the right side. The p-values are the
probabilities that the observed distribution occurred by chance in the data set.

Figure 49 shows a ranked matrix of some of the best markers obtained according to Example 
1 (Metastatic setting, limited sample set). P-values were calculated from Likelihood ratio 
(LR) tests from univariate logistic regression models. The figure is shown in greyscale, 
wherein the most significant CpG positions are at the bottom of the matrix with significance 
decreasing towards the top. Black indicates total methylation at a given CpG position, white 
represents no methylation at the particular position, with degrees of methylation represented 
in grey, from light (low proportion of methylation) to dark (high proportion of methylation). 
Each row represents one specific CpG position within a gene and each column shows the 
methylation profile for the different CpGs for one sample. The p-values for the individual 
CpG positions are shown on the right side. The p-values are the probabilities that the observed 
distribution occurred by chance in the data set. 

Figures 47 and 50 show the uncorrected p-values on a log-scale. P-values were calculated
from Likelihood ratio (LR) tests from multivariate logistic regression models according to
Example 1 (metastatic setting). Each individual genomic region of interest is represented as a
point, the upper dotted line represents the cut off point for the 25% false discovery rate, the
lower dotted line shows the Bonferroni corrected 5% limit.
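
As a brief illustration of the Bonferroni corrected limit, the 5% level is divided by the
number of tests performed, and both the limit and the P-values can then be placed on a
logarithmic scale. The function names below are hypothetical and the number of tests is a
parameter, since it is not stated in this passage.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance limit after Bonferroni correction."""
    return alpha / n_tests


def log10_values(p_values):
    """Base-10 logarithms of the P-values, for plotting on a log-scale."""
    return [math.log10(p) for p in p_values]
```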

Figure 51 shows a ranked matrix of the best 11 amplificates of data obtained according to
Example 1 (Metastatic setting, all samples). P-values were calculated from Likelihood ratio
(LR) tests from multivariate logistic regression models. The figure is shown in greyscale,
wherein the most significant CpG positions are at the bottom of the matrix with significance
decreasing towards the top. Black indicates total methylation at a given CpG position, white
represents no methylation at the particular position, with degrees of methylation represented
in grey, from light (low proportion of methylation) to dark (high proportion of methylation).
Each row represents one specific CpG position within a gene and each column shows the
methylation profile for the different CpGs for one sample. The p-values for the individual
CpG positions are shown on the right side. The p-values are the probabilities that the observed
distribution occurred by chance in the data set.

Figure 52 shows the disease-free survival curves for a combination of two oligonucleotides 
each from the genes TBC1D3 and CDK6, and one oligonucleotide from the gene PITX2. The 
black plot shows the proportion of disease free patients in the population with above median 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below median methylation levels. 

Figure 53 shows the plot according to Figure 52 and the classification of the sample set by 
means of the St. Gallen method. The unbroken lines represent the methylation analysis 
wherein the black plot shows the proportion of disease free patients in the population with 
above median methylation levels, the grey plot shows the proportion of disease free patients 
in the population with below median methylation levels. The broken lines represent the St. 
Gallen classification of the sample set wherein the black plot shows the disease free survival 
time of the high risk group and the grey plot shows the disease free survival of the low risk 
group. 

Figure 54 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the PITX2 gene by means of Real-Time methylation specific probe analysis. The lower 
plot shows the proportion of disease free patients in the population with above median meth- 
ylation levels, the upper plot shows the proportion of disease free patients in the population 
with below median methylation levels. The X axis shows the disease free survival times of the 
patients in months, and the Y-axis shows the proportion of disease free survival patients.

Figure 55 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the ERBB2 gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 56 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG position
of the ERBB2 gene by means of Real-Time methylation specific probe analysis according to
Example 2. The X axis shows the disease free survival times of the patients in years,
and the Y-axis shows the proportion of patients with disease free survival. The black plot
shows the proportion of metastasis free patients in the population with above an optimised cut
off point's methylation levels, the grey plot shows the proportion of disease free patients in
the population with below an optimised cut off point's methylation levels.

Figure 57 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the TFF1 gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 58 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the TFF1 gene by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with disease free survival. The black plot shows
the proportion of metastasis free patients in the population with above an optimised cut off 
point's methylation levels, the grey plot shows the proportion of metastasis free patients in the 
population with below an optimised cut off point's methylation levels. 

Figure 59 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the PLAU gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 60 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG position
of the PLAU gene by means of Real-Time methylation specific probe analysis according
to Example 2. The X axis shows the disease free survival times of the patients in years, and
the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of disease free patients in the population with above an optimised cut
off point's methylation levels, the grey plot shows the proportion of metastasis free patients in
the population with below an optimised cut off point's methylation levels.

Figure 61 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position
of the PITX2 gene by means of Real-Time methylation specific probe analysis according to
Example 2. The X axis shows the disease free survival times of the patients in years, and the
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's
methylation levels, the grey plot shows the proportion of disease free patients in the
population with below an optimised cut off point's methylation levels.

Figure 62 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the PITX2 gene by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of metastasis free patients in the population with above an optimised cut
off point's methylation levels, the grey plot shows the proportion of disease free patients in
the population with below an optimised cut off point's methylation levels. 

Figure 63 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the TBC1D3 gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 64 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the TBC1D3 gene by means of Real-Time methylation specific probe analysis ac- 
cording to Example 2. The X axis shows the disease free survival times of the patients in 
years, and the Y-axis shows the proportion of patients with metastasis free survival. The
black plot shows the proportion of metastasis free patients in the population with above an
optimised cut off point's methylation levels, the grey plot shows the proportion of disease free 
patients in the population with below an optimised cut off point's methylation levels. 

Figure 65 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the ERBB2 gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 66 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the ERBB2 gene by means of Real-Time methylation specific probe analysis accord- 
ing to Example 2. The X axis shows the disease free survival times of the patients in years, 
and the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of metastasis free patients in the population with above an optimised cut 
off point's methylation levels, the grey plot shows the proportion of disease free patients in 
the population with below an optimised cut off point's methylation levels. 

Figure 67 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the TFF1 gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 68 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the TFF1 gene by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with disease free survival. The black plot shows
the proportion of metastasis free patients in the population with above an optimised cut off
point's methylation levels, the grey plot shows the proportion of metastasis free patients in the
population with below an optimised cut off point's methylation levels. 

Figure 69 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the PLAU gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 70 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG position
of the PLAU gene by means of Real-Time methylation specific probe analysis according to
Example 2. The X axis shows the disease free survival times of the patients in years,
and the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of metastasis free patients in the population with above an optimised cut 
off point's methylation levels, the grey plot shows the proportion of disease free patients in 
the population with below an optimised cut off point's methylation levels. 

Figure 71 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the PITX2 gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's
methylation levels, the grey plot shows the proportion of disease free patients in the
population with below an optimised cut off point's methylation levels.

Figure 72 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG po- 
sition of the PITX2 gene by means of Real-Time methylation specific probe analysis ac- 
cording to Example 2. The X axis shows the disease free survival times of the patients in 
years, and the Y-axis shows the proportion of patients with metastasis free survival. The
black plot shows the proportion of disease free patients in the population with above an opti- 
mised cut off point's methylation levels, the grey plot shows the proportion of metastasis free 
patients in the population with below an optimised cut off point's methylation levels. 

Figure 73 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position
of the PITX2 gene by means of Real-Time methylation specific probe analysis according to
Example 2. The X axis shows the disease free survival times of the patients in years, and the
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 74 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the PITX2 gene by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of metastasis free patients in the population with above an optimised cut 
off point's methylation levels, the grey plot shows the proportion of disease free patients in 
the population with below an optimised cut off point's methylation levels. 

Figure 75 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the ONECUT2 gene by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with disease free survival. The black plot shows
the proportion of disease free patients in the population with above an optimised cut off 
point's methylation levels, the grey plot shows the proportion of disease free patients in the 
population with below an optimised cut off point's methylation levels. 

Figure 76 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the ONECUT2 gene by means of Real-Time methylation specific probe analysis ac- 
cording to Example 2. The X axis shows the metastasis free survival times of the patients in 
years, and the Y-axis shows the proportion of patients with metastasis free survival. The
black plot shows the proportion of disease free patients in the population with above an opti- 
mised cut off point's methylation levels, the grey plot shows the proportion of disease free 
patients in the population with below an optimised cut off point's methylation levels.

Figure 77 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position
of the TBC1D3 gene by means of Real-Time methylation specific probe analysis according to
Example 2. The X axis shows the disease free survival times of the patients in years, and the
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's 
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 78 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the TBC1D3 gene by means of Real-Time methylation specific probe analysis ac- 
cording to Example 2. The X axis shows the metastasis free survival times of the patients in 
years, and the Y-axis shows the proportion of patients with metastasis free survival. The
black plot shows the proportion of disease free patients in the population with above an
optimised cut off point's methylation levels, the grey plot shows the proportion of disease free
patients in the population with below an optimised cut off point's methylation levels. 

Figure 79 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the ABCA8 gene by means of Real-Time methylation specific probe analysis according to 
Example 2. The X axis shows the disease free survival times of the patients in years, and the 
Y-axis shows the proportion of patients with disease free survival. The black plot shows the
proportion of disease free patients in the population with above an optimised cut off point's
methylation levels, the grey plot shows the proportion of disease free patients in the popula- 
tion with below an optimised cut off point's methylation levels. 

Figure 80 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of the ABCA8 gene by means of Real-Time methylation specific probe analysis accord- 
ing to Example 2. The X axis shows the disease free survival times of the patients in years, 
and the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of metastasis free patients in the population with above an optimised cut 
off point's methylation levels, the grey plot shows the proportion of disease free patients in 
the population with below an optimised cut off point's methylation levels. 

Figure 81 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position
of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO: 16) genes by means
of Real-Time methylation specific probe analysis according to Example 2. The X axis shows
the disease free survival times of the patients in years, and the Y-axis shows the proportion of
patients with disease free survival. The black plot shows the proportion of disease free pa- 
tients in the population with above an optimised cut off point's methylation levels, the grey 
plot shows the proportion of disease free patients in the population with below an optimised 
cut off point's methylation levels. 

Figure 82 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16) genes by 
means of Real-Time methylation specific probe analysis according to Example 2. The X axis
shows the metastasis free survival times of the patients in years, and the Y-axis shows the
proportion of patients with metastasis free survival. The black plot shows the proportion of 
disease free patients in the population with above an optimised cut off point's methylation 
levels, the grey plot shows the proportion of disease free patients in the population with below 
an optimised cut off point's methylation levels. 

Figure 83 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16) and PITX2 
(SEQ ID NO:23) genes by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with disease free survival. The black plot shows
the proportion of disease free patients in the population with above an optimised cut off 
point's methylation levels, the grey plot shows the proportion of disease free patients in the 
population with below an optimised cut off point's methylation levels. 

Figure 84 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16) and PITX2 
(SEQ ID NO:23) genes by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of metastasis free patients in the population with above an optimised cut 
off point's methylation levels, the grey plot shows the proportion of disease free patients in 
the population with below an optimised cut off point's methylation levels. 

Figure 85 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position
of a combination of the PITX2 (SEQ ID NO:23) and TFF1 (SEQ ID NO: 12) genes by means
of Real-Time methylation specific probe analysis according to Example 2. The X axis shows
the disease free survival times of the patients in years, and the Y-axis shows the proportion of
patients with disease free survival. The black plot shows the proportion of disease free
patients in the population with above an optimised cut off point's methylation levels, the grey
plot shows the proportion of disease free patients in the population with below an optimised 
cut off point's methylation levels. 

Figure 86 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG position
of a combination of the PITX2 (SEQ ID NO:23) and TFF1 (SEQ ID NO: 12) genes by
means of Real-Time methylation specific probe analysis according to Example 2. The X axis
shows the metastasis free survival times of the patients in years, and the Y-axis shows the
proportion of patients with metastasis free survival. The black plot shows the proportion of 
disease free patients in the population with above an optimised cut off point's methylation 
levels, the grey plot shows the proportion of disease free patients in the population with below 
an optimised cut off point's methylation levels. 

Figure 87 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of a combination of the PITX2 (SEQ ID NO:23) and PLAU (SEQ ID NO: 16) genes by means 
of Real-Time methylation specific probe analysis according to Example 2. The X axis shows 
the disease free survival times of the patients in years, and the Y-axis shows the proportion of
patients with disease free survival. The black plot shows the proportion of disease free pa- 
tients in the population with above an optimised cut off point's methylation levels, the grey 
plot shows the proportion of disease free patients in the population with below an optimised 
cut off point's methylation levels. 

Figure 88 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG position
of a combination of the PITX2 (SEQ ID NO:23) and PLAU (SEQ ID NO:16) genes by
means of Real-Time methylation specific probe analysis according to Example 2. The X axis
shows the metastasis free survival times of the patients in years, and the Y-axis shows the
proportion of patients with metastasis free survival. The black plot shows the proportion of
disease free patients in the population with above an optimised cut off point's methylation
levels, the grey plot shows the proportion of disease free patients in the population with below
an optimised cut off point's methylation levels.

Figure 89 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO: 16) genes by means 
of Real-Time methylation specific probe analysis according to Example 2. The X axis shows 
the disease free survival times of the patients in years, and the Y-axis shows the proportion of
patients with disease free survival. The black plot shows the proportion of disease free pa- 
tients in the population with above an optimised cut off point's methylation levels, the grey 
plot shows the proportion of disease free patients in the population with below an optimised 
cut off point's methylation levels. 

Figure 90 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16) genes by 
means of Real-Time methylation specific probe analysis according to Example 2. The X axis 
shows the disease free survival times of the patients in years, and the Y-axis shows the
proportion of patients with metastasis free survival. The black plot shows the proportion of
metastasis free patients in the population with above an optimised cut off point's methylation
levels, the grey plot shows the proportion of disease free patients in the population with below 
an optimised cut off point's methylation levels. 

Figure 91 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16) and PITX2 
(SEQ ID NO:23) genes by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the disease free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with disease free survival. The black plot shows
the proportion of disease free patients in the population with above an optimised cut off 
point's methylation levels, the grey plot shows the proportion of disease free patients in the 
population with below an optimised cut off point's methylation levels. 

Figure 92 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of a combination of the TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO: 16) and PITX2 
(SEQ ID NO:23) genes by means of Real-Time methylation specific probe analysis according 
to Example 2. The X axis shows the metastasis free survival times of the patients in years, and 
the Y-axis shows the proportion of patients with metastasis free survival. The black plot
shows the proportion of disease free patients in the population with above an optimised cut 
off point's methylation levels, the grey plot shows the proportion of disease free patients in 
the population with below an optimised cut off point's methylation levels. 

Figure 93 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of a combination of the PITX2 (SEQ ID NO:23) and TFF1 (SEQ ID NO: 12) genes by 
means of Real-Time methylation specific probe analysis according to Example 2. The X axis 
shows the disease free survival times of the patients in years, and the Y-axis shows the
proportion of patients with disease free survival. The black plot shows the proportion of disease
free patients in the population with above an optimised cut off point's methylation levels, the 
grey plot shows the proportion of disease free patients in the population with below an opti- 
mised cut off point's methylation levels. 

Figure 94 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG posi- 
tion of a combination of the PITX2 (SEQ ID NO:23) and TFF1 (SEQ ID NO: 12) genes by 
means of Real-Time methylation specific probe analysis according to Example 2. The X axis 
shows the metastasis free survival times of the patients in years, and the Y- axis shows the 
proportion of patients with metastasis free survival. The black plot shows the proportion of 
disease free patients in the population with above an optimised cut off point's methylation 
levels, the grey plot shows the proportion of disease free patients in the population with below 
an optimised cut off point's methylation levels. 

Figure 95 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of a combination of the PITX2 (SEQ ID NO:23) and PLAU (SEQ ID NO:16) genes by 
means of Real-Time methylation specific probe analysis according to Example 2. The X axis 
shows the disease free survival times of the patients in years, and the Y- axis shows the pro- 
portion of patients with disease free survival. The black plot shows the proportion of disease 
free patients in the population with above an optimised cut off point's methylation levels, the 
grey plot shows the proportion of disease free patients in the population with below an opti- 
mised cut off point's methylation levels. 

Figure 96 shows the Kaplan-Meier estimated metastasis-free survival curves for a CpG position
of a combination of the PITX2 (SEQ ID NO:23) and PLAU (SEQ ID NO: 16) genes by
means of Real-Time methylation specific probe analysis according to Example 2. The X axis
shows the metastasis free survival times of the patients in years, and the Y- axis shows the
proportion of patients with metastasis free survival. The black plot shows the proportion of
metastasis free patients in the population with above an optimised cut off point's methylation
levels, the grey plot shows the proportion of disease free patients in the population with below
an optimised cut off point's methylation levels.

Figure 97 shows a scatter plot of matched pair PET and fresh frozen tissues analysed using 
PITX2 gene assay 1 according to Example 2. Quantitative methylation CT scores of PET 
samples are shown on the Y-axis, and quantitative methylation CT scores of fresh frozen 
samples are shown on the X-axis. The association between the paired samples is 0.81 (Spear- 
man's rho). This analysis is based on n=89 samples. 

Figure 98 shows the disease free survival (DFS) of the randomly selected ER+, N0, untreated
patient population in a Kaplan-Meier survival plot according to Example 2. The proportion of
disease free patients is shown on the Y-axis and time in years is shown on the X-axis. 139 events
were observed (observed event rate=33%). Disease free survival after 5 years: 74.5% [70.3%,
78.9%], after 10 years: 59.8% [54.2%, 66%]. 95% confidence intervals are plotted.

Figure 99 shows the distribution of follow-up times in ER+, N0, untreated population according
to Example 2. Frequency is shown on the Y-axis and time in months is shown on the X-axis.
The figure on the left shows patients with event (all kinds of relapses). Mean follow-up time
45.8 months (standard deviation=31), median=38 (range=[2, 123]).
The figure on the right shows censored patients. Mean follow up time 93 months (standard
deviation=35.6), median=94 (range=[1, 190]).

Figure 100 shows the disease free survival (DFS) of ER+, N0, TAM treated population in a
Kaplan-Meier plot according to Example 2. The proportion of disease free patients is shown on
the Y-axis and time in years is shown on the X-axis. 56 events were observed (observed event
rate=10%). DFS after 5 years: 92.4% [90%, 94.9%], after 10 years: 82.1% [77.3%, 87.2%].
95% confidence intervals are plotted.

Figure 101 shows the distribution of follow-up times in ER+, N0, untreated population according
to Example 2. Frequency is shown on the Y-axis and time in months is shown on the X-axis.
The figure on the left shows patients with all events (all kinds of relapses). Mean follow-up
time 47.9 months (standard deviation=24.4), median=45 (range=[2, 98]).
The figure on the right shows censored patients. Mean follow up time 65.3 months (standard
deviation=31.6), median=64 (range=[0, 158]).

Figure 102 shows the ROC plot at different times for marker model 3522 (Assay 1) and 2265
on ER+N0 TAM treated population according to Example 2. Figure A shows the plot at 60
months, figure B shows the plot at 72 months, figure C shows the plot at 84 months and figure
D shows the plot at 96 months. Only distant metastases are defined as events. Sensitivity
(proportion of all relapsed patients in poor prognostic group) shown on the X-axis and specificity
(proportion of all relapse free patients in good prognostic group) shown on the Y-axis are
calculated from KM estimates, and the estimated area under the curve (AUC) is calculated.
Values for median cut off (triangle) and best cut off (diamond, 0.32 quantile) are plotted.

Figure 103 shows the ROC plot at different times for marker model 3522 (Assay 1) alone on 
ER+N0 TAM treated population according to Example 2. Figure A shows the plot at 60 
months, figure B shows the plot at 72 months, figure C shows the plot at 84 months and figure 
D shows the plot at 96 months. Only distant metastases are defined as events. Sensitivity
(proportion of all relapsed patients in poor prognostic group) shown on the X-axis and specificity
(proportion of all relapse free patients in good prognostic group) shown on the Y-axis are cal- 
culated from KM estimates, and the estimated area under the curve (AUC) is calculated. Val- 
ues for median cut off (triangle) and best cut off (diamond, 0.42 quantile) are plotted. 

Figure 104 shows the ROC plot at different times for marker model 2265 on ER+N0 TAM
treated population according to Example 2. Figure A shows the plot at 60 months, figure B
shows the plot at 72 months, figure C shows the plot at 84 months and figure D shows the plot
at 96 months. Only distant metastases are defined as events. Sensitivity (proportion of all
relapsed patients in poor prognostic group) shown on the X-axis and specificity (proportion of
all relapse free patients in good prognostic group) shown on the Y-axis are calculated from
KM estimates for different thresholds (= 5, 6, 7, 8 years), and the estimated area under the
curve (AUC) is calculated. Values for median cut off (triangle) and best cut off (diamond,
0.78 quantile) are plotted.

Figure 105 shows the ROC plot at different times for marker model 2395 on ER+N0 TAM
treated population according to Example 2. Figure A shows the plot at 60 months, figure B
shows the plot at 72 months, figure C shows the plot at 84 months and figure D shows the plot
at 96 months. Only distant metastases are defined as events. Sensitivity (proportion of all
relapsed patients in poor prognostic group) shown on the X-axis and specificity (proportion of
all relapse free patients in good prognostic group) shown on the Y-axis are calculated from
KM estimates for different thresholds (= 5, 6, 7, 8 years), and the estimated area under the
curve (AUC) is calculated. Values for median cut off (triangle) and best cut off (diamond,
0.77 quantile) are plotted.

SEQ ID NOS: 1 to 61 and 149 to 150 represent 5' and/or regulatory regions and/or CpG rich
regions of the genes according to Table 1. These sequences are derived from Genbank and
will be taken to include all minor variations of the sequence material which are currently
unforeseen, for example, but not limited to, minor deletions and SNPs.

Example 1 

DNA samples were extracted using the Wizard Kit (Promega). Samples from 278 patients
were analysed, and data analyses were carried out on a selection of candidate markers.

Bisulfite treatment and mPCR 

Total genomic DNA of all samples was bisulfite treated converting unmethylated cytosines to 
uracil. Methylated cytosines remained conserved. Bisulfite treatment was performed with 
minor modifications according to the protocol described in Olek et al. (1996). After bisulfita- 
tion 10 ng of each DNA sample was used in subsequent mPCR reactions containing 6-8 
primer pairs. 

Each reaction contained the following: 
2.5 pmol each primer 
11.25 ng DNA (bisulfite treated)
Multiplex PCR Master mix (Qiagen)

Further details of the primers are shown in TABLE 2.

Initial denaturation was carried out at 95°C for 15 min. Forty cycles were carried out as
follows: Denaturation at 95°C for 30 sec, followed by annealing at 57°C for 90 sec, primer
elongation at 72°C for 90 sec. A final elongation at 72°C was carried out for 10 min.

Hybridisation 

All PCR products from each individual sample were then hybridised to glass slides carrying a
pair of immobilised oligonucleotides for each CpG position under analysis. Each of these de- 
tection oligonucleotides was designed to hybridise to the bisulphite converted sequence 
around one CpG site which was either originally unmethylated (TG) or methylated (CG). See 
Table 2 for further details of hybridisation oligonucleotides used. Hybridisation conditions 
were selected to allow the detection of the single nucleotide differences between the TG and 
CG variants. 

5 µl volume of each multiplex PCR product was diluted in 10 x Ssarc buffer. The reaction
mixture was then hybridised to the detection oligonucleotides as follows. Denaturation at
95°C, cooling down to 10 °C, hybridisation at 42°C overnight followed by washing with 10 x
Ssarc and dH2O at 42°C. Further details of the hybridisation oligonucleotides are shown in
TABLE 3.

Fluorescent signals from each hybridised oligonucleotide were detected using a GenePix scanner
and software. Ratios for the two signals (from the CG oligonucleotide and the TG
oligonucleotide used to analyse each CpG position) were calculated based on comparison of
the intensity of the fluorescent signals.

Data analysis methods 

Analysis of the chip data: from raw hybridisation intensities to methylation ratios. The log
methylation ratio (log(CG/TG)) at each CpG position is determined according to a
standardised preprocessing pipeline that includes the following steps:

1. For each spot the median background pixel intensity is subtracted from the median
foreground pixel intensity (this gives a good estimate of background corrected hybridisation
intensities).

2. For both CG and TG detection oligonucleotides of each CpG position the background
corrected median of the 4 redundant spot intensities is taken.

3. For each chip and each CpG position the log(CG/TG) ratio is calculated.

4. For each sample the median of log(CG/TG) intensities over the redundant chip
repetitions is taken.

This ratio has the property that the hybridisation noise has approximately
constant variance over the full range of possible methylation rates (Huber et al., 2002).
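
The four preprocessing steps can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the data layout (lists of pixel and spot intensities) and the use of the natural logarithm are assumptions, since the text does not specify them.

```python
import math
from statistics import median


def corrected_intensity(foreground_pixels, background_pixels):
    """Step 1: median foreground minus median background for one spot."""
    return median(foreground_pixels) - median(background_pixels)


def oligo_intensity(spot_intensities):
    """Step 2: median over the redundant (background corrected) spots."""
    return median(spot_intensities)


def log_ratio(cg_spots, tg_spots):
    """Step 3: log(CG/TG) for one CpG position on one chip.

    Assumes both medians are positive; the log base is an assumption.
    """
    return math.log(oligo_intensity(cg_spots) / oligo_intensity(tg_spots))


def sample_log_ratio(chips):
    """Step 4: median log(CG/TG) over the redundant chip repetitions.

    `chips` is a list of (cg_spots, tg_spots) tuples, one per chip.
    """
    return median(log_ratio(cg, tg) for cg, tg in chips)
```

Taking medians at each level keeps single outlying spots or chips from dominating the final ratio.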

Hypothesis testing 

The main task is to identify markers that show significant differences in the average degree of
methylation between two classes. A significant difference is detected when the null hypothesis
that the average methylation of the two classes is identical can be rejected with p<0.05.
Because we apply this test to a whole set of potential markers we have to correct the p-values for
multiple testing. This was done by applying the False Discovery Rate (FDR) method (Dudoit
et al., 2002).
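
As an illustration of a false discovery rate correction, the following sketch adjusts a list of p-values with the Benjamini-Hochberg step-up procedure. The text cites Dudoit et al. (2002) without naming the exact procedure, so the choice of Benjamini-Hochberg here is an assumption, as is the function name.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A marker would then be reported when its adjusted p-value falls below the chosen FDR level.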

For testing the null hypothesis that the methylation levels in the two classes are identical we 
used the likelihood ratio test for logistic regression models (Venables and Ripley, 2002). The 
logistic regression model for a single marker is a linear combination of methylation measure- 
ments from all CpG positions in the respective genomic region of interest (ROI). A significant 
p-value for a marker means that this ROI has some systematic correlation to the question of 
interest as given by the two classes. However, at least formally it makes no statement about 
the actual predictive power of the marker. 

Logistic Regression 

Logistic regression models are tools to model the probability of an event in dependence of 
one or more variables or factors. For example, if x denotes a specific methylation log ratio, 
the probability that a patient responds to the applied therapy (Tamoxifen) is modeled as 

$$P(\text{response} \mid x) = \frac{\exp(\alpha + \beta x)}{1 + \exp(\alpha + \beta x)} \qquad (1)$$

If $x_1, \ldots, x_k$ denote the k methylation log ratios measured for one amplificate, the model is

$$P(\text{response} \mid x_1, \ldots, x_k) = \frac{\exp(\alpha + \beta_1 x_1 + \ldots + \beta_k x_k)}{1 + \exp(\alpha + \beta_1 x_1 + \ldots + \beta_k x_k)} \qquad (2)$$

Significance of the respective amplificate is assessed using a likelihood-ratio test. This test
calculates the difference of -2Log(likelihood) for the full model and the null-model including
just the intercept $\alpha$, which is approximately $\chi^2$-distributed with k degrees of freedom
under the null hypothesis $\beta_1 = \ldots = \beta_k = 0$.

If additional covariates are considered, the model contains an additional parameter for each
covariate and the test statistic is calculated as the difference of -2Log(likelihood) for the full
model and the null-model including intercept and covariates. Again, given the null hypothesis,
this difference is approximately $\chi^2$-distributed with k degrees of freedom.
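
The likelihood-ratio test for the single-marker logistic model can be sketched as follows. This is a minimal pure-Python illustration, not the software used in the study: the Newton-Raphson fit, the function names and the fixed iteration count are assumptions, and the data must not be perfectly separable for the fit to converge.

```python
import math


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y=1 | x) = exp(a + b*x) / (1 + exp(a + b*x)) by Newton-Raphson."""
    alpha = beta = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to alpha
            g1 += (y - p) * x      # gradient with respect to beta
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    return alpha, beta


def log_likelihood(xs, ys, alpha, beta):
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def chi2_sf(stat, df):
    """Upper tail probability of a chi-square variable with integer df."""
    half = max(stat, 0.0) / 2.0
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0                   # closed form for df = 2
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5        # closed form for df = 1
    while 2 * a < df:                                 # step df up by 2 each time
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(xs, ys):
    """Return (beta, LR statistic, p-value) for one methylation log ratio."""
    alpha, beta = fit_logistic(xs, ys)
    ll_full = log_likelihood(xs, ys, alpha, beta)
    rate = sum(ys) / len(ys)  # intercept-only model predicts the response rate
    ll_null = sum(math.log(rate) if y == 1 else math.log(1.0 - rate) for y in ys)
    stat = max(2.0 * (ll_full - ll_null), 0.0)
    return beta, stat, chi2_sf(stat, df=1)
```

With k log ratios per amplificate the same statistic would be compared against a chi-square distribution with k degrees of freedom, i.e. `chi2_sf(stat, df=k)`.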

Ranked Matrices 

For a graphical display of all group comparisons, ranked matrices are used. Each row repre- 
sents one oligo pair, whereas each column of the matrix stands for one sample (or chip in the 
case of up- versus downmethylated Promega DNA comparisons). Oligo pairs are ranked ac- 
cording to their discriminatory power (Wilcoxon test, Fisher score or logistic regression), 
where the best "marker" is displayed on the bottom line. Low methylation is displayed in light 
grey, high methylation in dark grey, and the data are normalized prior to display. 

Cox Regression 

Disease-free survival times (DFS) are modeled using Cox regression models. These models 
are similar to logistic regression models, but instead of probabilities, the hazard is modeled. 
The hazard gives the instantaneous risk of a relapse. The models 

$$h(t \mid x) = h_0(t) \cdot \exp(\beta x) \qquad (3)$$

and

$$h(t \mid x_1, \ldots, x_k) = h_0(t) \cdot \exp(\beta_1 x_1 + \ldots + \beta_k x_k) \qquad (4)$$

are used for uni- and multivariate analyses, respectively, where t is the time measured in
months after surgery and $h_0(t)$ is the baseline hazard. Likelihood ratio tests are performed
similar to those used for logistic regression. Again, the difference between -2Log(likelihood)
of the full model and the null-model is approximately $\chi^2$-distributed with k degrees of
freedom under the null hypothesis $\beta_1 = \ldots = \beta_k = 0$. Additional covariates can
be included into the models.
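
To make the univariate Cox model concrete, the sketch below evaluates its partial log-likelihood for a given coefficient. The baseline hazard cancels out of this quantity, which is why it never appears in the code. The function name and argument layout are assumptions, and tied event times are not handled.

```python
import math


def cox_partial_log_likelihood(beta, times, events, xs):
    """Partial log-likelihood of h(t | x) = h0(t) * exp(beta * x).

    times  -- follow-up time of each patient (e.g. months after surgery)
    events -- 1 if a relapse was observed at that time, 0 if censored
    xs     -- methylation log ratio of each patient
    Assumes no tied event times (a simplification made for this sketch).
    """
    total = 0.0
    for t_i, e_i, x_i in zip(times, events, xs):
        if not e_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(beta * x_j)
                   for t_j, x_j in zip(times, xs) if t_j >= t_i)
        total += beta * x_i - math.log(risk)
    return total
```

Twice the difference between this value at the fitted coefficient and at zero gives the likelihood ratio statistic referred to in the text.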

Stepwise Regression Analysis 

For both multivariate logistic and Cox regression models, a stepwise procedure is used in
order to find submodels including only relevant variables. Two effects are usually achieved by
these procedures: Variables (methylation ratios) that are basically unrelated to the dependent
variable (response state or DFS, respectively) are excluded as they do not add relevant
information to the model. Out of a set of highly correlated variables, only the one with the best
relation to the dependent variable is retained. Inclusion of both types of variables can lead to
numerical instabilities and a loss of power. Moreover, the predictive performance can be low
due to overfitting. The applied algorithm aims at minimizing the Akaike information criterion
(AIC) which is defined as

$$\mathrm{AIC} = -2 \cdot (\text{maximized log-likelihood}) + 2 \cdot (\text{number of parameters})$$

The AIC is related to the predictive performance of a model; smaller values promise better
performance. Whereas the inclusion of additional variables always improves the model fit and
thus increases the likelihood, the second term penalizes the estimation of additional
parameters. The best model will present a compromise model with good fit and usually a small or
moderate number of variables.
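
The trade-off expressed by the AIC can be illustrated with a short sketch that scores candidate models and keeps the one with the smallest value. The function names are assumptions and the sketch does not implement the full stepwise search; it only shows the comparison that each step of such a search performs.

```python
def aic(max_log_likelihood, n_parameters):
    """Akaike information criterion; smaller is better."""
    return -2.0 * max_log_likelihood + 2.0 * n_parameters


def best_by_aic(candidates):
    """Pick the model label with the lowest AIC.

    `candidates` maps a label to (maximized log-likelihood, #parameters).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

For example, with purely hypothetical log-likelihoods of -120.0, -118.5 and -118.4 for models with 2, 3 and 4 parameters, the AIC values are 244.0, 243.0 and 244.8, so the three-parameter model is retained: the fourth parameter improves the fit too little to justify its penalty.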

Results 

Adjuvant setting 

Analysis of the methylation patterns of patient samples treated with Tamoxifen as an adjuvant 
therapy immediately following surgery (see Figure 1) is shown in the plots according to Fig- 
ures 3 to 45. For each amplificate, the mean methylation over all oligo-pairs for that amplifi- 
cate was calculated and the population split into groups according to their mean methylation 
values, wherein one group was composed of individuals with a methylation score higher than 
the median and a second group composed of individuals with a methylation score lower than 
the median. 
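
The median split described above can be sketched as follows. The data layout (a mapping from patient identifier to the methylation values of all oligo pairs of one amplificate) and the function name are assumptions; patients whose score equals the median are left out of both groups here, a detail the text does not specify.

```python
from statistics import mean, median


def median_split(methylation_by_patient):
    """Split patients at the median of their mean amplificate methylation."""
    scores = {pid: mean(values)
              for pid, values in methylation_by_patient.items()}
    cutoff = median(scores.values())
    high = sorted(pid for pid, s in scores.items() if s > cutoff)
    low = sorted(pid for pid, s in scores.items() if s < cutoff)
    return cutoff, high, low
```

The two resulting groups are the ones whose disease free survival curves are then compared.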

The results are shown in figures 3 to 9, as Cox model estimated disease-free survival curves.
Figures 10 to 34 show the disease free survival curves using the methylation analyses of only
a single oligonucleotide.

In a further analysis the recurrence of distant metastases only was analysed in figures 35 to 
46. 

The accuracy of the differentiation between the different groups was further increased by
combining multiple oligonucleotides from different genes. Figure 53 shows the combination
of two oligonucleotides each from the genes TBC1D3 and CDK6, and one oligonucleotide
from the gene PITX2. The broken lines show the classification of the patients from the sample
set by means of the St. Gallen method (the current method of choice for estimating disease
free survival) as compared to methylation analysis (unbroken lines), thereby showing the
improved effectiveness of methylation analysis over current methods, in particular post 80
months. The St. Gallen method is the most commonly used set of treatment selection criteria for
breast cancer patients. The criteria are revised every two years and are based upon clinical
factors (age, type of cancer, size, metastasis etc.); they are used to divide patients into high risk
and low risk cases, which follow different rules for therapy.

Metastatic setting 

Analysis of the methylation patterns of patient samples treated with Tamoxifen in a metastatic
setting (see Figure 2) is shown in the matrices according to Figures 46 to 52. The subjects
analysed in this classification had relapsed following an initial treatment, the subsequent
metastasis being treated by Tamoxifen.

In order to determine the ability of each gene promoter to predict success or failure of
Tamoxifen treatment, the individual CpGs measured were combined per gene using Hotelling's T²
statistics. Several genes were significantly associated with response to Tamoxifen after
correcting for multiple comparison with a moderately conservative false discovery rate of 25% (see
Figure 52). The genes were ONECUT2, WBP11, CYP2D6, DAG1, ERBB2, S100A2, TFF1,
TP53, TMEFF2, ESR1, SYK, RASSF1, PITX2, PSAT1, CGA and PCAF.

Figure 50 shows the uncorrected p-values on a log-scale. P-values were calculated from Like- 
lihood ratio (LR) tests from multivariate logistic regression models. Each individual genomic 
region of interest is represented as a point, the upper dotted line represents the cut off point 
for the 25% false discovery rate, the lower dotted line shows the Bonferroni corrected 5% 
limit. 

Figure 51 shows a ranked matrix of the best 11 amplificates of data obtained. P-values were
calculated from Likelihood ratio (LR) tests from multivariate logistic regression models. The
figure is shown in greyscale, wherein the most significant CpG positions are at the bottom of
the matrix with significance decreasing towards the top. Black indicates total methylation at a
given CpG position, white represents no methylation at the particular position, with degrees
of methylation represented in grey, from light (low proportion of methylation) to dark (high
proportion of methylation). Each row represents one specific CpG position within a gene and
each column shows the methylation profile for the different CpGs for one sample. The
p-values for the individual CpG positions are shown on the right side. The p-values are the
probabilities that the observed distribution occurred by chance in the data set.

Figures 47 through 49 show the analysis of a subset. Figure 47 shows the uncorrected p-values
on a log-scale. P-values were calculated from
Likelihood ratio (LR) tests from multivariate logistic regression models according to Example
1 (metastatic setting). Each individual genomic region of interest is represented as a point,
the upper dotted line represents the cut off point for the 25% false discovery rate, the lower
dotted line shows the Bonferroni corrected 5% limit.

Figure 48 shows a ranked matrix of the best 11 amplificates of data obtained. P-values were
calculated from Likelihood ratio (LR) tests from multivariate logistic regression models. The 
figure is shown in greyscale, wherein the most significant CpG positions are at the bottom of 
the matrix with significance decreasing towards the top. Black indicates total methylation at a 
given CpG position, white represents no methylation at the particular position, with degrees 
of methylation represented in grey, from light (low proportion of methylation) to dark (high 
proportion of methylation). Each row represents one specific CpG position within a gene and 
each column shows the methylation profile for the different CpGs for one sample. The p- 
values for the individual CpG positions are shown on the right side. The p-values are the 
probabilities that the observed distribution occurred by chance in the data set. 

Real time Quantitative methylation analysis 

Genomic DNA was analyzed using the Real Time PCR technique after bisulfite conversion. In
this analysis four oligonucleotides were used in each reaction. Two non methylation specific
PCR primers were used to amplify a segment of the treated genomic DNA containing a
methylation variable oligonucleotide probe binding site. Two oligonucleotide probes competitively
hybridise to the binding site, one specific for the methylated version of the binding site, the
other specific to the unmethylated version of the binding site. Accordingly, one of the probes
comprises a CpG at the methylation variable position (i.e. anneals to methylated bisulphite
treated sites) and the other comprises a TpG at said position (i.e. anneals to unmethylated
bisulphite treated sites). Each species of probe is labelled with a 5' fluorescent reporter dye and
a 3' quencher dye, wherein the CpG and TpG oligonucleotides are labelled with different dyes.

The reactions are calibrated by reference to DNA standards of known methylation levels in
order to quantify the levels of methylation within the sample. The DNA standards were
composed of bisulfite treated phi29 amplified genomic DNA (i.e. unmethylated), and/or phi29
amplified genomic DNA treated with SssI Methylase enzyme (thereby methylating each CpG
position in the sample), which is then treated with bisulfite solution. Seven different reference
standards were used with 0% (i.e. phi29 amplified genomic DNA only), 5%, 10%, 25%,
50%, 75% and 100% (i.e. phi29 SssI treated genomic DNA only).

The amount of sample DNA amplified is quantified by reference to the gene β-actin (ACTB)
to normalize for input DNA. For standardization the primers and the probe for analysis of the
ACTB gene lack CpG dinucleotides so that amplification is possible regardless of methylation
levels. As there are no methylation variable positions, only one probe oligonucleotide is
required.

The following oligonucleotides were used in the reaction: 

Primer: TGGTGATGGAGGAGGTTTAGTAAGT (SEQ ID NO: 1088) 

Primer: AACCAATAAAACCTACTCCTCCCTTAA (SEQ ID NO: 1089) 

Probe: 6FAM-ACCACCACCCAACACACAATAACAAACACA-TAMRA or Dabcyl (SEQ ID NO: 1090)

The extent of methylation at a specific locus was determined by the following formula: 

$$\text{methylation rate} = 100 \cdot \frac{I_{CG}}{I_{CG} + I_{TG}}$$

(I = intensity of the fluorescence of the CG-probe or TG-probe)
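
The methylation rate formula translates directly into code. The sketch below is illustrative; the function name and the guard against a missing signal are assumptions not taken from the text.

```python
def methylation_rate(i_cg, i_tg):
    """Percent methylation from CG-probe and TG-probe fluorescence."""
    total = i_cg + i_tg
    if total <= 0:
        raise ValueError("no fluorescence signal from either probe")
    return 100.0 * i_cg / total
```

A locus where the CG-probe contributes 30 of 100 intensity units is therefore scored as 30% methylated.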

Gene PITX2 
Primers: 

PITX2R02: GTAGGGGAGGGAAGTAGATGTT (SEQ ID NO: 1091) 
PITX2Q02: TTCTAATCCTCCTTTCCACAATAA (SEQ ID NO: 1092) 
Amplificate length : 143 bp 
Probes: 

PITX2cg1: FAM-AGTCGGAGTCGGGAGAGCGA-Darkquencher (SEQ ID NO: 1093)
PITX2tg1: YAKIMA YELLOW-AGTTGGAGTTGGGAGAGTGAAAGGAGA-Darkquencher (SEQ ID NO: 1094)

PCR components: 3 mM MgCl2 buffer, 10x buffer, Hotstart TAQ
Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 62 °C, 1 min

Figure 54 shows the Kaplan-Meier estimated disease-free survival curves for a CpG position 
of the PITX2 gene by means of Real-Time methylation specific probe analysis. The lower 
plot shows the proportion of disease free patients in the population with above median meth- 
ylation levels, the upper plot shows the proportion of disease free patients in the population 
with below median methylation levels. The X axis shows the disease free survival times of the 
patients in months, and the Y- axis shows the proportion of disease free survival patients. The 
p-value (probability that the observed distribution occurred by chance) was calculated as 
0.0031, thereby confirming the data obtained by means of array analysis according to figure 
6. 

Example 2 

In order to validate the most promising markers from the microarray study of Example 1,
Real-Time assays were designed and optimised in order to provide assays of optimum
accuracy. The assays were run on a combination of paraffin embedded tissue (hereinafter also
referred to as PET) and fresh frozen tissue samples. DNA derived from PET is often of 'lower
quality' (e.g. higher degree of DNA fragmentation and low DNA yield from samples), thus
confirmation of assay results on PET demonstrates the robustness of the assay and increased
utility of the marker.

Quantitative methylation assays were designed for the genes ERBB2 (SEQ ID NO: 5), TFF1
(SEQ ID NO: 12), PLAU (SEQ ID NO:16), PITX2 (SEQ ID NO:23), ONECUT2 (SEQ ID
NO:35), TBC1D3 (SEQ ID NO: 43), and ABCA8 (SEQ ID NO: 49) and tested using a sample
set of 415 samples from estrogen receptor positive, node negative, untreated breast cancer
patients and 541 samples from estrogen receptor positive, node negative, Tamoxifen treated
patients. Approximately 100 of these samples were previously analysed in the microarray study.

The QM assay (= Quantitative Methylation Assay) is a Real-time PCR based method for
quantitative DNA methylation detection. The assay principle is based on non-methylation
specific amplification of the target region and a methylation specific detection by competitive
hybridization of two different probes specific for the CG or the TG status, respectively. For
the present study, TaqMan probes were used that were labeled with two different fluorescence
dyes ("FAM" for CG specific probes, "VIC" for TG specific probes) and were further
modified by a quencher molecule ("TAMRA" or "Minor Groove Binder/non-fluorescent
quencher").

Evaluation of the QM assay raw data is possible with two different methods: 

1. Measuring absolute fluorescence intensities (FI) in the logarithmic phase of amplification

2. Difference in threshold cycles (Ct) of CG and TG specific probe.

Results of this study were generated by using the Ct method.
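
One common way to turn a difference in threshold cycles into a methylation percentage is sketched below. The text does not give the conversion actually used, so the formula is an assumption: it presumes that both probes report on the same amplification, that the product doubles in every cycle (efficiency 2), and that both probes detect their target equally well. The function and parameter names are hypothetical.

```python
def methylation_from_ct(ct_cg, ct_tg, efficiency=2.0):
    """Estimate percent methylation from the CG and TG threshold cycles.

    Assumed model: the signal ratio CG/TG equals efficiency ** (ct_tg - ct_cg),
    so the methylated fraction is 1 / (1 + efficiency ** (ct_cg - ct_tg)).
    """
    delta_ct = ct_cg - ct_tg
    return 100.0 / (1.0 + efficiency ** delta_ct)
```

Under these assumptions equal threshold cycles correspond to 50% methylation, and a CG probe crossing the threshold one cycle earlier than the TG probe corresponds to roughly 67%.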

In the following series of quantitative methylation assays the amount of sample DNA
amplified is quantified by reference to the gene GSTP1 to normalize for input DNA. For
standardization, the primers and the probe for analysis of the GSTP1 gene lack CpG dinucleotides so
that amplification is possible regardless of methylation levels. As there are no methylation
variable positions, only one probe oligonucleotide is required.



Sample Sets 

ER+ N0 Untreated Population

To demonstrate that the markers identified have a strong prognostic component, ER+ N0
tumor samples from patients not treated with any adjuvant therapy were analyzed. Markers that
are able to show a significant survival difference in this population are considered to be
prognostic. All 508 samples of this set were obtained from an academic collaborator as cell nuclei
pellets (fresh frozen samples). The sample population can be divided into two subsets: One
with 415 randomly selected samples (from both censored and relapsing patients), representing
a population with a natural distribution of relapses, and an additional 93 samples from relapsing
patients only. The latter samples were used for sensitivity/specificity analyses only.

Figure 98 shows the disease-free survival of the randomly selected population in a Kaplan- 
Meier plot and Figure 99 the distribution of follow-up times for the relapsed and censored 
patients in histograms. Table 4 lists the number of events broken down by different kinds of 
relapse. In summary, the survival of this population is comparable to the expected one from 
the literature.

ER+ N0 TAM treated Population

One intended target population of the invention is patients with ER+ N0 tumors that are
treated with hormone therapy. To check the performance of the marker candidates in this
population, 589 samples from ER+ N0 tumors from patients treated with Tamoxifen were
analyzed. All samples were received as Paraffin-embedded tissues (PET). Three to ten 10 µm
sections were provided.

In addition, for 89 PET patient samples, matching fresh frozen samples from the same tumor
were included in the study as controls. As these samples were already used in phase 1, they
allowed for two kinds of concordance studies:

• Chip versus QM assay 

• Fresh frozen versus PET samples 

Samples of the ER+, N0, TAM treated population were received from eight different
providers. Altogether 589 samples were processed, 48 of which had to be excluded from the study
due to various reasons (e.g. two samples from same tumor, samples from patients that did not
fulfill inclusion criteria etc.).

Figure 100 shows the disease-free survival of the total population in a Kaplan-Meier plot and 
Figure 101 the distribution of follow-up times for the relapsed and censored patients in histo- 
grams. Table 5 lists the number of events broken down by different kinds of relapse. 
In summary, the survival of this population (82.1 % after 10 years) is comparable to the ex- 
pected one from the literature (79.2 %). 

DNA Extraction 

DNA extraction from Fresh Frozen Samples 

From a total of 508 fresh frozen samples available as cell nuclei pellets, genomic DNA was 
isolated using the QIAamp Kit (Qiagen, Hilden, Germany). The extraction was done accord- 
ing to the Cell Culture protocol using Proteinase K with few modifications. 

DNA extraction from PET Samples 

The 589 provided PET samples were deparaffinated directly in the tube in which they were
delivered by the providers. The tissue was then lysed and DNA extracted using the QIAGEN
DNeasy Tissue kit.

Bisulfite treatment

Bisulfite treatment was carried out based on the method disclosed by Olek et al. Nucleic Ac- 
ids Res. 1996 Dec 15;24(24):5064-6, and optimised to the applicant's laboratory workflow. 

Quantification Standards 

The reactions are calibrated by reference to DNA standards of known methylation levels in
order to quantify the levels of methylation within the sample. The DNA standards were composed
of bisulfite treated phi29 amplified human genomic DNA (Promega) (i.e. unmethylated),
and/or phi29 amplified genomic DNA treated with SssI methylase enzyme
(thereby methylating each CpG position in the sample), which is then treated with bisulfite
solution. Seven different reference standards were used with 0% (i.e. phi29 amplified genomic
DNA only), 5%, 10%, 25%, 50%, 75% and 100% (i.e. phi29 SssI treated genomic DNA only).

2000 ng batches of human genomic DNA (Promega) were treated with bisulfite. To generate
methylated MDA DNA, 13 tubes of 4.5 µg MDA-DNA (700 ng/µl) were treated with SssI.

Control assay 

The GSTP1-C3 assay design makes it suitable for quantitating DNAs from different sources, 
including fresh/frozen samples, remote samples such as plasma or serum, and DNA obtained 
from archival specimen such as paraffin embedded material. 

The following oligonucleotides were used in the reaction to amplify the control amplificate: 

Control Primer 1: GGAGTGGAGGAAATTGAGAT (SEQ ID NO: 1095)

Control Primer 2: CCACACAACAAATACTCAAAAC (SEQ ID NO: 1096)

Control Probe: FAM-TGGGTGTTTGTAATTTTTGTTTTGTGTTAGGTT-TAMRA (SEQ ID NO: 1097)

Cycle program (40 cycles): 95 °C, 10 min; 95 °C, 15 sec; 58 °C, 1 min

Assay design and reaction conditions

Two assays were developed for the analysis of the gene PITX2 (SEQ ID NO: 23).

Assay 1:

Primers: GTAGGGGAGGGAAGTAGATGTT (SEQ ID NO: 1098)

TTCTAATCCTCCTTTCCACAATAA (SEQ ID NO: 1099)

Probes: FAM-AGTCGGAGTCGGGAGAGCGA-TAMRA (SEQ ID NO: 1100)

VIC-AGTTGGAGTTGGGAGAGTGAAAGGAGA-TAMRA (SEQ ID NO: 1101)



Amplicon: 

GtAGGGGAGGGAAGtAGATGttAGHlGGtfflAAGAGt 





GGAGAGGGGAttTGGi|GGGtAtTTAGGAGttAAtllAGGAGtAGGAG- 




Length of fragment: 143 bp 

Positions of primers, probes and CpG dinucleotides are highlighted.



PCR components (supplied by Eurogentec): 3 mM MgCl2 buffer, 10x buffer, Hotstart TAQ,
200 µM dNTP, 625 nM each primer, 200 nM each probe



Cycle program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 62 °C, 1 min

Assay 2:

Primers: AACATCTACTTCCCTCCCCTAC (SEQ ID NO: 1102)

GTTAGTAGAGATTTTATTAAATTTTATTGTAT (SEQ ID NO: 1103)

Probes: FAM-TTCGGTTGCGCGGT-MGBNFQ (SEQ ID NO: 1104)

VIC-TTTGGTTGTGTGGTTG-MGBNFQ (SEQ ID NO: 1105)



Amplicon: 

GDA Gt a G a a a TTtt a n A a AtTn a tTGt A . A GTnr fr— flW GGlUlGtlllGtll AGj 

MGtl^ aMl GfrGGMATttAGGAGBkGtAtAG^gttMGGBAGMtBGGGG- 

r T An|^|AGtAGGGGMAilAGAAA^iAGGtAGGGGAGGGAAGtAGATGtt 

Length of fragment: 164 bp

The positions of probes, primers and CpG positions are highlighted. 



The probes cover three co-methylated CpG positions. 

PCR components (supplied by Eurogentec): 2.5 mM MgCl2 buffer, 10x buffer, Hotstart TAQ,
200 µM dNTP, 625 nM each primer, 200 nM each probe



Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 60 °C, 1 min

The extent of methylation at a specific locus was determined by the following formulas:

Using absolute fluorescence intensity:

\[ \text{methylation rate} = 100 \cdot \frac{I(CG)}{I(CG) + I(TG)} \]

(I = intensity of the fluorescence of the CG-probe or TG-probe)

Using threshold cycle Ct:

\[ \text{methylation rate} = 100 \cdot \frac{CG}{CG + TG} = \frac{100}{1 + TG/CG} = \frac{100}{1 + 2^{\Delta Ct}} \]

(assuming PCR efficiency E = 2; $\Delta Ct$ = Ct(methylated) - Ct(unmethylated))
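The two formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the `efficiency` parameter are assumptions made for the example, and the sketch assumes the same idealised PCR efficiency (E = 2) as the text.

```python
def methylation_rate_from_intensity(i_cg: float, i_tg: float) -> float:
    """Methylation rate (%) from CG- and TG-probe fluorescence intensities."""
    total = i_cg + i_tg
    if total <= 0:
        raise ValueError("combined fluorescence intensity must be positive")
    return 100.0 * i_cg / total


def methylation_rate_from_ct(ct_cg: float, ct_tg: float, efficiency: float = 2.0) -> float:
    """Methylation rate (%) from the threshold cycles of the CG and TG probes.

    delta(Ct) = Ct(methylated) - Ct(unmethylated); with E = 2 the result is
    100 / (1 + 2 ** delta(Ct)).
    """
    delta_ct = ct_cg - ct_tg
    return 100.0 / (1.0 + efficiency ** delta_ct)


if __name__ == "__main__":
    print(methylation_rate_from_intensity(1.0, 3.0))        # 25.0
    print(methylation_rate_from_ct(30.0, 30.0))             # 50.0
    print(round(methylation_rate_from_ct(29.0, 30.0), 1))   # 66.7
```

A CG probe that crosses the threshold one cycle earlier than the TG probe therefore corresponds to a methylation rate of about two thirds.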

Gene PLAU (SEQ ID NO: 16)

Primers: GTTAGGTGTATGGGAGGAAGTA (SEQ ID NO: 1106)

TCCCTCCCCTATCTTACAA (SEQ ID NO: 1107)

Probes: FAM-ACCCGAACCCCGCGTACTTC-TAMRA (SEQ ID NO: 1108)

VIC-ACCCAAACCCCACATACTTCCACA-TAMRA (SEQ ID NO: 1109)

Amplicon: 
GttAGGTG^ 

GAflHtTGTTGGGTttttTtCGtTGGAGATC 

MBA AOt AEc^G^ lB^ ^^TGAGlli-TGtA ACMXGGGGAGGGA 
Length of fragment: 166 bp

The positions of probes, primers and CpG positions are highlighted. 

PCR components (supplied by Eurogentec): 2.5 mM MgCl2 buffer, 10x buffer, Hotstart
TAQ, 200 µM dNTP, 625 nM each primer, 200 nM each probe

Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 60 °C, 1 min 

Gene ONECUT2 (SEQ ID NO: 35) 

Primers: GTAGGAAGAGGTGTTGAGAAATTAA (SEQ ID NO: 1110)

CCACACAAAAAATTTCTATACTCCT (SEQ ID NO: 1111)

Probes: FAM-ACGGGTAGAGGCGCGGGT-TAMRA (SEQ ID NO: 1112)

VIC-ATGGGTAGAGGTGTGGGTTATATTGTTTTG-TAMRA (SEQ ID NO: 1113)



Amplicon:

GtTGtAGGtTt^ttTTTGtATTAAGH|GGBitTGATTGTCj
GAGGAtTGGMGttiaHGGAGGGGA^GGfAGAGG 





a 1 


tTGG 


r;.?., 


At 





GG- 




G AGtiBGtTBBGtTtTTTGTGttTttTtT AGBBGtt A AG tTG 





GGTAtAGtttTt- 



Length of fragment: 266 bp 

The positions of probes, primers and CpG positions are highlighted. 



PCR components (supplied by Eurogentec): 3 mM MgCl2 buffer, 10x buffer, Hotstart
TAQ, 200 µM dNTP, 625 nM each primer, 200 nM each probe

Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 60 °C, 1 min



Gene ABCA8 (SEQ ID NO: 49) 

Primers: GTGAGGTATTGGATTTAGTTTATTTG (SEQ ID NO: 1114)

CCCTAAATCTCATCCTAAAAACAC (SEQ ID NO: 1115)

Probes: FAM-TGAGGTTTCGGTTTTTAACGGTGG-TAMRA (SEQ ID NO: 1116)

VIC-TGAGGTTTTGGTTTTTAATGGTGGGAT-TAMRA (SEQ ID NO: 1117)

Amplicon:

GTGAGGT AtTGGATTtAGtttATTTGGtttHlAAGttTtTGTTtTjlHGAATtlllGGTGtT- 
GTGGGT^^MPIMGMKlMGlSiiGAtTGGTGTttTMAG 
GTTTttTHGGGtTTTGGTGGGATHGTGTttTtAGGATGAGATTTAGGG 
Length of fragment: 168 bp 

The positions of probes, primers and CpG positions are highlighted. 



PCR components (supplied by Eurogentec): 3 mM MgCl2 buffer, 10x buffer, Hotstart
TAQ, 200 µM dNTP, 625 nM each primer, 200 nM each probe

Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 62 °C, 1 min



Gene ERBB2 (SEQ ID NO: 5) 

Primers: GGAGGGGGTAGAGTTATTAGTTTT (SEQ ID NO: 1118)

ACTCCCAACTTCACTTTCTCC (SEQ ID NO: 1119)

Probes: FAM-TAATTTAGGCGTTTCGGCGTTAGG-TAMRA (SEQ ID NO: 1120)



WO 2005/059172 PCT/EP2004/014170



VIC-TAATTTAGGTGTTTTGGTGTTAGGAGGGA-TAMRA (SEQ ID NO: 1121)

Amplicon:

GGAGGGGGTAGAGTTATTAGTTTTTGTATTTAGGGATTTTT^AGGAAAAGTGTG 
AGAAHGTTGTAGGlIIliS 

TGHBBBaAG AG AGGC i A ( i A A A (3 TG A A G TTG G G A G T 
Length of fragment: 144 bp 

The positions of probes, primers and CpG positions are highlighted. 

PCR components (supplied by Eurogentec): 2.5 mM MgCl2 buffer, 10x buffer, Hotstart
TAQ, 200 µM dNTP, 625 nM each primer, 200 nM each probe

Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 62 °C, 1 min

Gene TFF1 (SEQ ID NO: 12) 

Primers: AGTTGGTGATGTTGATTAGAGTT (SEQ ID NO: 1122)

CCCTCCCAATATACAAATAAAAACTA (SEQ ID NO: 1123)

Probes: FAM-ACACCGTTCGTAAAA-MGBNFQ (SEQ ID NO: 1124)

VIC-ACACCATTCATAAAAT-MGBNFQ (SEQ ID NO: 1125)

Amplicon: 

agttggtgatgtl'galtagagtltttg'ragttttaaatgatttttttaattaattlt 
aaatttttagaatttat|BtataaaaaggttatattttttggagggaKtHatg 

GTATTAGGATAGAAGTATTAGGGGAlgg^^g^PlllM^rMAAATAGTA^ 

TTTTATTTGTATATTGGGAGGG 

Length of fragment: 189 bp

The positions of probes, primers and CpG positions are highlighted. 

PCR components (supplied by Eurogentec): 2.5 mM MgCl2 buffer, 10x buffer, Hotstart
TAQ, 200 µM dNTP, 625 nM each primer, 200 nM each probe

Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 60 °C, 1 min

Gene TBC1D3 (SEQ ID NO: 43) 

Primers: TTTTTAGTTGGTTTTTATTAGGGTTTT (SEQ ID NO: 1126)

CCAACATATCCACCCACTTACT (SEQ ID NO: 1127)

Probes: FAM-TTTCGACTAATCTCCCGCCGA-TAMRA (SEQ ID NO: 1128)

VIC-TTTCAACTAATCTCCCACCAAATTTACTATCA-TAMRA (SEQ ID NO: 1129)

Amplicon:

tTTttAGtTGGtTtttAttAGGGtTttAGAGtttAAGAtttAGtATtfM^GGHGtTtT- 

GGGAAGttTGGtAGtTtf^TAAtTttAAtATGttTtATTTGAtAGtAAATtfHGK^Mg 
B&lMiliaGAGtAAGTGGGTGGATATGtTGG 

Length of fragment: 142 bp 

The positions of probes, primers and CpG positions are highlighted. 

PCR components (supplied by Eurogentec): 4.5 mM MgCl2 buffer, 10x buffer, Hotstart
TAQ, 200 µM dNTP, 625 nM each primer, 200 nM each probe

Program (45 cycles): 95 °C, 10 min; 95 °C, 15 sec; 60 °C, 1 min

Each of the designed assays was tested on the following sets of samples: 

• Tamoxifen treated patients who relapsed during treatment (all relapses). 

• Tamoxifen treated patients who relapsed during treatment with distant metastases 
only. 

• Non-Tamoxifen treated patients who relapsed during treatment (all relapses). 

• Non-Tamoxifen treated patients who relapsed during treatment with distant metastases 
only. 

Raw Data Processing 

All analyses were based on CT evaluation (evaluations using fluorescence intensities are
available upon request). Assuming optimal real-time PCR conditions in the exponential
amplification phase, the concentration of methylated DNA ($C_{meth}$) can be determined by

\[ C_{meth} = \frac{100}{1 + 2^{(CT_{CG} - CT_{TG})}} \]

where

$CT_{CG}$ denotes the threshold cycle of the CG reporter (FAM channel) and
$CT_{TG}$ denotes the threshold cycle of the TG reporter (VIC channel).

The thresholds for the cycles were determined by human experts after a visual inspection of
the Amplification Plots [ABI PRISM 7900 HT Sequence Detection System User Guide]. The
values for the cycles ($CT_{CG}$ and $CT_{TG}$) were calculated with these thresholds by the ABI 7900
software. Whenever the amplification curve did not exceed the threshold, the value of the
cycle was set to the maximum cycle, i.e. 50.

Statistical Methods

Cox Regression

The relation between disease-free survival times (DFS) (or metastasis-free survival, MFS) and
covariates is modeled using Cox Proportional Hazard models (Cox and Oakes, 1984; Harrell,
2001).

The hazard, i.e. the instantaneous risk of a relapse, is modeled as

\[ h(t \mid x) = h_0(t) \cdot \exp(\beta x) \qquad (3) \]

and

\[ h(t \mid x_1, \ldots, x_k) = h_0(t) \cdot \exp(\beta_1 x_1 + \ldots + \beta_k x_k) \qquad (4) \]

for univariate and multiple regression analyses, respectively, where $t$ is the time measured in
months after surgery, $h_0(t)$ is the baseline hazard, $x$ is the vector of covariates (e.g.
measurements of the assays) and $\beta$ is the vector of regression coefficients (parameters of
the model). $\beta$ is estimated by maximizing the partial likelihood of the Cox proportional
hazard model. Likelihood ratio tests are performed to test whether methylation is related to the
hazard. The difference between $2 \log(\text{Likelihood})$ of the full model and the null model is
approximately $\chi^2$-distributed with $k$ degrees of freedom under the null hypothesis
$\beta_1 = \ldots = \beta_k = 0$.
The assumption of proportional hazards was checked by scaled Schoenfeld residuals (Therneau
et al., 2000). For the calculation, analysis and diagnostics of the Cox Proportional Hazard
Model the R functions coxph and cox.zph of the "survival" package were used.
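As an illustration of the univariate model and the likelihood ratio test described above, the following standard-library Python sketch maximizes the partial likelihood for a single covariate. It is not the R `coxph` implementation used in the study: the function names, the Breslow treatment of ties, the Newton-Raphson scheme with step halving and the toy data are all assumptions made for this example.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate (Breslow form for ties)."""
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored observations contribute only through risk sets
        risk_sum = sum(math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += beta * x[i] - math.log(risk_sum)
    return loglik


def score_and_information(beta, times, events, x):
    """First derivative and negative second derivative of the partial log-likelihood."""
    score = info = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue
        weights = [(math.exp(beta * x[j]), x[j]) for j, t_j in enumerate(times) if t_j >= t_i]
        s0 = sum(w for w, _ in weights)
        s1 = sum(w * v for w, v in weights)
        s2 = sum(w * v * v for w, v in weights)
        score += x[i] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info


def fit_cox_univariate(times, events, x, max_iter=100, tol=1e-8):
    """Maximize the partial likelihood by Newton-Raphson with step halving."""
    beta = 0.0
    current = cox_partial_loglik(beta, times, events, x)
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        if info <= 0.0:
            break
        step = score / info
        # Halve the step until the log-likelihood no longer decreases.
        while abs(step) > tol and cox_partial_loglik(beta + step, times, events, x) < current:
            step /= 2.0
        if abs(step) <= tol:
            break
        beta += step
        current = cox_partial_loglik(beta, times, events, x)
    return beta


def likelihood_ratio_test(times, events, x):
    """Return (beta, LR statistic, p-value) for H0: beta = 0 (1 degree of freedom)."""
    beta = fit_cox_univariate(times, events, x)
    statistic = 2.0 * (cox_partial_loglik(beta, times, events, x)
                       - cox_partial_loglik(0.0, times, events, x))
    statistic = max(statistic, 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))  # chi-square survival function, 1 df
    return beta, statistic, p_value


if __name__ == "__main__":
    # Toy data (hypothetical): follow-up in months, event flag, methylation ratio.
    months = [5, 8, 12, 20, 25, 30, 40, 60]
    relapse = [1, 1, 0, 1, 1, 0, 1, 0]
    methylation = [0.9, 0.7, 0.8, 0.4, 0.6, 0.2, 0.3, 0.1]
    print(likelihood_ratio_test(months, relapse, methylation))
```

A positive estimate of the coefficient means that higher covariate values go along with a higher hazard; for a multivariate model the same statistic would be compared with a chi-square distribution with k degrees of freedom.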

Stepwise Regression Analysis 

For multivariate Cox regression models a stepwise procedure (Venables et al., 1999; Harrell,
2001) was used in order to find sub-models including only relevant variables. Two effects are
usually achieved by these procedures:

• Variables (methylation rates) that are basically unrelated to the dependent variable 
(DFS/MFS) are excluded as they do not add relevant information to the model. 

• Out of a set of highly correlated variables, only the one with the best relation to the 
dependent variable is retained. 

Inclusion of both types of variables can lead to numerical instabilities and a loss of power.
Moreover, the predictive performance can be low due to overfitting.

The applied algorithm aims at minimizing the Akaike information criterion (AIC), which is
defined as

\[ AIC = -2 \cdot \text{maximized log-likelihood} + 2 \cdot \#\text{parameters}. \]

The AIC is related to the predictive performance of a model; smaller values promise better
performance. Whereas the inclusion of additional variables always improves the model fit and
thus increases the likelihood, the second term penalizes the estimation of additional
parameters. The best model will present a compromise model with good fit and usually a small or
moderate number of variables. Stepwise regression calculation with AIC was done with the R
function "step".
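The trade-off expressed by the AIC can be illustrated with a short Python sketch. It shows only the criterion itself, not the stepwise search performed by the R function "step"; the function names and the log-likelihood values are hypothetical and chosen purely for illustration.

```python
def aic(max_log_likelihood, n_parameters):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * max_log_likelihood + 2.0 * n_parameters


def select_by_aic(models):
    """Return the name of the model with the smallest AIC.

    `models` maps a model name to (maximized log-likelihood, number of parameters).
    """
    return min(models, key=lambda name: aic(*models[name]))


if __name__ == "__main__":
    # Hypothetical log-likelihoods, chosen only to illustrate the trade-off.
    candidates = {
        "one marker": (-120.0, 1),
        "two markers": (-118.5, 2),
        "three markers": (-118.4, 3),
    }
    for name, (loglik, k) in candidates.items():
        print(name, aic(loglik, k))
    print(select_by_aic(candidates))
```

In this made-up example the third marker raises the log-likelihood by only 0.1, which does not offset the penalty for the extra parameter, so the two-marker model has the smallest AIC.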

Kaplan-Meier Survival Curves and Log-Rank Tests 

Survival curves are estimated from DFS/MFS data using the Kaplan-Meier method (Kaplan
and Meier, 1958). Log-rank tests were used to test for differences between two survival curves,
e.g. survival in hyper- vs. hypomethylated groups. For a description of this test see Cox and
Oakes (1984). For the Kaplan-Meier analysis the functions "survfit" and "survdiff" of the
"survival" package were used.

Independence of markers from other covariates 

To check whether our marker panel gives additional and independent information, other relevant
clinical factors were included in the Cox proportional hazard model and the p-values for
the weights of every factor were calculated (Wald test) (Therneau et al., 2000). For the
analysis of additional factors in the Cox Proportional Hazard model, the R function "coxph"
was used.

Correlation Analysis 

Pearson and Spearman correlation coefficients are calculated to estimate the concordance 
between measurements (e.g. methylation in matched fresh frozen and PET samples). 
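Both coefficients can be computed with a few lines of standard-library Python. The sketch below is a plain illustration; the function names and the example numbers are assumptions, and ties are handled by average ranks, which is the usual convention for Spearman's rho.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical paired measurements (e.g. fresh frozen vs. PET).
    fresh_frozen = [1, 2, 3, 4, 5]
    pet = [1, 4, 9, 16, 25]
    print(round(pearson(fresh_frozen, pet), 3))  # 0.981
    print(spearman(fresh_frozen, pet))
```

Spearman's rho depends only on the ordering of the measurements, so a monotone but non-linear relation between the paired values still yields a coefficient of 1.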

Density Estimation 

For numerical variables, kernel density estimation was performed with a Gaussian kernel and
variable bandwidth. The bandwidth is determined using Silverman's "rule-of-thumb"
(Silverman, 1986). For the calculation of the densities the R function "density" was used.
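A minimal Python sketch of a Gaussian kernel density estimate with a rule-of-thumb bandwidth is given below. It assumes the common form of Silverman's rule, 0.9 · min(standard deviation, IQR/1.34) · n^(-1/5), and is a simplified stand-in rather than a reproduction of the R function "density"; the function names and the example data are hypothetical.

```python
import math
import statistics


def silverman_bandwidth(values):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    n = len(values)
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34)
    if spread == 0.0:
        spread = sd if sd > 0.0 else 1.0  # fallback for (nearly) constant data
    return 0.9 * spread * n ** -0.2


def gaussian_kde(values, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate at every point of `grid`."""
    h = silverman_bandwidth(values) if bandwidth is None else bandwidth
    norm = 1.0 / (len(values) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - v) / h) ** 2) for v in values) for g in grid]


if __name__ == "__main__":
    # Hypothetical methylation rates in percent.
    rates = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    print(silverman_bandwidth(rates))
    print(gaussian_kde(rates, [0, 50, 100]))
```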

Analysis of Sensitivity and Specificity 

For the analysis of sensitivity and specificity of single assays and marker panels, ROC curves
were calculated. The calculation of the ROC curves was done with two methods. The first method
is to calculate sensitivity and specificity for a given time threshold $T_{Threshold}$. With that
threshold, true positives, false positives, true negatives and false negatives were defined and
the values for sensitivity and specificity were calculated for different cutoffs of the model.
Patients censored before $T_{Threshold}$ were excluded. The ROC curves were calculated for
different times $T_{Threshold}$ (3 years, 4 years, ..., 10 years). The second method is to
calculate sensitivity and specificity by using the Bayes formula based on the Kaplan-Meier
estimates (Heagerty et al., 2000) for the survival probabilities in the marker positive and
marker negative groups for a given time $T_{Threshold}$. The ROC curves were calculated for
different times $T_{Threshold}$ (3 years, 4 years, ..., 10 years).
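The first of the two methods can be sketched as follows. The sketch is an illustration under stated assumptions: the function names and the toy data are hypothetical, a patient is counted as relapsed if the event occurred at or before the time threshold, and a patient is counted as marker positive if the model score lies above the cutoff. The Kaplan-Meier based second method is not shown.

```python
def sensitivity_specificity(scores, times, events, t_threshold, cutoff):
    """Sensitivity and specificity at time t_threshold for one model cutoff.

    Patients censored before t_threshold are excluded because their status
    at t_threshold is unknown.
    """
    tp = fp = tn = fn = 0
    for score, time, event in zip(scores, times, events):
        if not event and time < t_threshold:
            continue
        relapsed = bool(event) and time <= t_threshold
        predicted_relapse = score > cutoff
        if relapsed and predicted_relapse:
            tp += 1
        elif relapsed:
            fn += 1
        elif predicted_relapse:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


def roc_points(scores, times, events, t_threshold):
    """Sorted (1 - specificity, sensitivity) pairs using every score as cutoff."""
    points = []
    for cutoff in [float("-inf")] + sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, times, events, t_threshold, cutoff)
        points.append((1.0 - spec, sens))
    return sorted(points)


def area_under_curve(points):
    """Trapezoidal area under a sorted list of ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    # Toy data (hypothetical): model score, follow-up in months, event flag.
    scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
    months = [12, 30, 100, 20, 90, 120]
    relapse = [1, 1, 0, 0, 1, 0]
    print(sensitivity_specificity(scores, months, relapse, 60, 0.5))
    print(area_under_curve(roc_points(scores, months, relapse, 60)))
```

In the toy data the fourth patient is censored at 20 months and is therefore left out of the evaluation at 60 months, while the fifth patient, who relapses at 90 months, counts as relapse free at that threshold.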



k-fold Crossvalidation 

For the analysis of model selection and model robustness, k-fold crossvalidation (Hastie et al.,
2001) was used. The set of observations was split into k chunks at random. In turn, every chunk
was used as a test set and the remaining k-1 chunks were used as the training set. This procedure
was repeated n times.
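The splitting scheme described above can be sketched in Python as follows. The function names, the seed handling and the numbers in the usage example are assumptions for illustration; model fitting on the training set is left out.

```python
import random


def k_fold_splits(n_observations, k, seed=None):
    """Yield (train_indices, test_indices) for one random k-fold partition."""
    rng = random.Random(seed)
    indices = list(range(n_observations))
    rng.shuffle(indices)
    chunks = [indices[i::k] for i in range(k)]  # k chunks of near-equal size
    for i, test in enumerate(chunks):
        train = [idx for j, chunk in enumerate(chunks) if j != i for idx in chunk]
        yield train, test


def repeated_k_fold(n_observations, k, n_repeats, seed=0):
    """Repeat the random partition n_repeats times with different shuffles."""
    for repeat in range(n_repeats):
        yield from k_fold_splits(n_observations, k, seed=seed + repeat)


if __name__ == "__main__":
    for train, test in k_fold_splits(10, 3, seed=1):
        print(sorted(test), "held out;", len(train), "observations for training")
```

Each observation is held out exactly once per repetition, so repeating the procedure n times yields n test-set assignments per observation.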

Population Charts 

For the description of the relation between censoring and a covariate, Population Charts
(Mocks et al., 2002) were used. The baseline of the covariate was calculated including all
observations with event. For a given time t, the mean (in case of real variables like age) or the 
fraction (in case of categorical variables) for all censored patients in the risk set at time t was 
calculated and added to the baseline value. 



Technical Performance

Comparison of Assay Replicates

Each marker was measured in at least three replicates. Variability between assay replicates
was observed to be higher for PET than for fresh frozen samples.

Concordance Study Fresh Frozen versus PET Samples 

Markers analyzed in this study (Example 2) were initially identified on a chip platform
(Example 1) using fresh frozen samples. The ER+ N0 untreated population was also analyzed on
fresh frozen samples in Example 2. A concordance study should demonstrate that measured
methylation ratios are comparable for fresh frozen and PET samples. For this purpose, 89
fresh frozen samples from three different providers already used in the chip study were
processed again in parallel with a matching PET sample originating from the same tumor.

Figure 97 shows such a concordance study for marker candidate PITX2 assay 1 as a scatter
plot between fresh frozen and PET samples (using the QM assay). The association between
the paired samples is 0.81 (Spearman's rho). This analysis is based on n=89 samples.

Results 

Evaluation of Single Markers 

Each of the eight established QM assays was used to measure the 508 samples from the N0,
ER+ untreated patient population (random selection and additional relapses) in three
replicates. After filtering out measuring points not fulfilling the quality criteria and
performing Cox analyses, Kaplan-Meier survival curves and ROC curves were generated for each
single marker.

Two different clinical endpoints were used for analyses: 

• Disease-free survival, i.e. using all kinds of relapses (distant metastasis, locoregional 
relapses, relapses at contralateral breast) as event. 

• Metastasis-free survival, i.e. treating only distant metastasis as an event. 

For analyzing the ER+, N0, TAM treated population, five marker candidates were analyzed
on 541 samples from this population. Assays were measured in
three replicates. Three assays that were measured on the untreated population (PITX2-II,
ONECUT2, and ABCA8) were not measured due to the limited material that was available for
the TAM treated population. These assays were rejected either because they performed poorly in
the untreated population (ONECUT2 and ABCA8) or, in the case of PITX2-II, because it performed
significantly worse than the other assay of this marker (PITX2-I). After filtering out measuring
points not fulfilling the quality criteria, Kaplan-Meier survival curves and ROC curves were
generated for each single marker.

Two different clinical endpoints were used: 

• Disease-free survival, i.e. using all kinds of relapses (distant metastasis, locoregional 
relapses, relapses at contralateral breast) as event. 

• Metastasis-free survival, i.e. treating only distant metastasis as an event. 

The Kaplan-Meier estimated disease-free survival or metastasis-free survival curves of each
single assay are shown in Figures 55 to 80, and combinations of assays are shown in Figures
81 to 96. The X-axis shows the disease-free survival times of the patients in years, and the
Y-axis shows the proportion of patients with disease-free survival. The black plot shows the
proportion of disease-free patients in the population with methylation levels above an optimised
cut-off point; the grey plot shows the proportion of disease-free patients in the population
with methylation levels below the optimised cut-off point.
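How the two curves of such a plot are obtained can be sketched in Python: the patients are split at the cut-off point and a Kaplan-Meier estimate is computed for each group. The function names and the toy data are hypothetical, and the sketch is not the R "survfit" routine used for the figures.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (event time, survival) steps."""
    curve = []
    survival = 1.0
    for t in sorted({t for t, d in zip(times, events) if d}):
        at_risk = sum(1 for u in times if u >= t)
        relapses = sum(1 for u, d in zip(times, events) if d and u == t)
        survival *= 1.0 - relapses / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, t):
    """Value of the Kaplan-Meier step function at time t."""
    survival = 1.0
    for step_time, step_value in curve:
        if step_time > t:
            break
        survival = step_value
    return survival


def curves_by_cutoff(methylation, times, events, cutoff):
    """Kaplan-Meier curves for patients above and at or below the cut-off."""
    above = [(t, d) for m, t, d in zip(methylation, times, events) if m > cutoff]
    below = [(t, d) for m, t, d in zip(methylation, times, events) if m <= cutoff]
    return (kaplan_meier([t for t, _ in above], [d for _, d in above]),
            kaplan_meier([t for t, _ in below], [d for _, d in below]))


if __name__ == "__main__":
    # Toy data (hypothetical): methylation in percent, follow-up in years, event flag.
    methylation = [40, 35, 30, 5, 8, 2]
    years = [1, 2, 3, 6, 8, 10]
    relapse = [1, 0, 1, 0, 1, 0]
    high, low = curves_by_cutoff(methylation, years, relapse, 13.1)
    print(high)
    print(low)
```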

The following p-values (probability that the observed distribution occurred by chance) were 
calculated when the cut off was optimised. For cut-off optimization, the quantiles of both 
groups were shifted between 0.2 and 0.8 and the p-value for the separation of the curves was 
calculated for each quantile. The quantile with the lowest p-value was then the best cut-off. 
Percentage values refer to the methylation ratios at the cut-off point. 
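The cut-off optimization can be sketched as a scan over quantiles, with a log-rank test at each candidate cut-off. The following Python code is a simplified illustration: the function names, the linear quantile interpolation and the step of 0.02 between quantiles are assumptions (the text states only the range 0.2 to 0.8), and the p-value is the 1-degree-of-freedom chi-square approximation of the log-rank statistic.

```python
import math


def log_rank_p(times, events, in_group):
    """Two-group log-rank p-value (chi-square approximation, 1 df)."""
    observed = expected = variance = 0.0
    for t in sorted({t for t, d in zip(times, events) if d}):
        n = sum(1 for u in times if u >= t)
        n_g = sum(1 for u, g in zip(times, in_group) if g and u >= t)
        d = sum(1 for u, e in zip(times, events) if e and u == t)
        d_g = sum(1 for u, e, g in zip(times, events, in_group) if e and g and u == t)
        observed += d_g
        expected += d * n_g / n
        if n > 1:
            variance += d * (n_g / n) * (1.0 - n_g / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0  # one group empty or no informative event times
    chi_square = (observed - expected) ** 2 / variance
    return math.erfc(math.sqrt(chi_square / 2.0))


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = int(math.floor(position))
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def best_cutoff(values, times, events, q_min=0.2, q_max=0.8, step=0.02):
    """Return (quantile, cut-off value, p-value) with the lowest log-rank p-value."""
    ordered = sorted(values)
    best = None
    n_steps = int(round((q_max - q_min) / step))
    for i in range(n_steps + 1):
        q = round(q_min + i * step, 10)
        cutoff = quantile(ordered, q)
        p = log_rank_p(times, events, [v > cutoff for v in values])
        if best is None or p < best[2]:
            best = (q, cutoff, p)
    return best


if __name__ == "__main__":
    # Toy data (hypothetical): high methylation goes along with early relapse.
    methylation = list(range(1, 11))
    months = [50, 48, 46, 44, 42, 10, 8, 6, 4, 2]
    relapse = [1] * 10
    print(best_cutoff(methylation, months, relapse))
```

Because the reported p-value is the minimum over many candidate cut-offs, it is optimistic compared with a p-value for a cut-off fixed in advance.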

Single gene assays 
Tamoxifen treated 

TAM treated (all relapses) ERBB2 (SEQ ID NO: 5) (Figure 55): p-value 0.089; cut off point:
1.3%

TAM treated (distant only) ERBB2 (SEQ ID NO: 5) (Figure 56): p-value 0.084; cut off point: 
0.1% 

TAM treated (all relapses) TFF1 (SEQ ID NO: 12) (Figure 57): p-value 0.037; cut off point: 
50.9% 

TAM treated (distant only) TFF1 (SEQ ID NO: 12) (Figure 58): p-value 0.029; cut off point: 
52.9% 

TAM treated (all relapses) PLAU (SEQ ID NO: 16) (Figure 59): p-value 0.056; cut off point: 
4.8% 

TAM treated (distant only) PLAU (SEQ ID NO: 16) (Figure 60): p-value 0.065; cut off point: 
4.8% 

TAM treated (all relapses) PITX2(SEQ ID NO:23) (Figure 61): p-value 0.01; cut off point: 
13.1% 

TAM treated (distant only) PITX2(SEQ ID NO:23) (Figure 62): p-value 0.0012; cut off point: 
14.3% 

TAM treated (all relapses) TBC1D3 (SEQ ID NO: 43) (assay II) (Figure 63): p-value 0.28; 
cut off point: 94.6% 

TAM treated (distant only) TBC1D3 (SEQ ID NO: 43) (assay II) (Figure 64): p-value 0.078;
cut off point: 97%

Figure 103 shows the ROC plot at different times for marker model PITX2 (Assay 1) alone on the
ER+ N0 TAM treated population. Figure A shows the plot at 60 months, figure B shows the
plot at 72 months, figure C shows the plot at 84 months and figure D shows the plot at 96
months. Only distant metastases are defined as events. Sensitivity (proportion of all relapsed
patients in the poor prognostic group), shown on the X-axis, and specificity (proportion of all
relapse-free patients in the good prognostic group), shown on the Y-axis, are calculated from KM
estimates, and the estimated area under the curve (AUC) is calculated. Values for the median cut
off (triangle) and the best cut off (diamond, 0.42 quantile) are plotted.
AUC 60 months: 0.6 
AUC 72 months: 0.69 
AUC 84 months: 0.69 
AUC 96 months: 0.67 

Figure 104 shows the ROC plot at different times for marker model TFF1 on the ER+ N0 TAM
treated population. Figure A shows the plot at 60 months, figure B shows the plot at 72
months, figure C shows the plot at 84 months and figure D shows the plot at 96 months. Only
distant metastases are defined as events. Sensitivity (proportion of all relapsed patients in the
poor prognostic group), shown on the X-axis, and specificity (proportion of all relapse-free
patients in the good prognostic group), shown on the Y-axis, are calculated from KM estimates for
different thresholds (= 5, 6, 7, 8 years), and the estimated area under the curve (AUC) is
calculated. Values for the median cut off (triangle) and the best cut off (diamond, 0.78
quantile) are plotted.
AUC 60 months: 0.7 
AUC 72 months: 0.65 
AUC 84 months: 0.61 
AUC 96 months: 0.64 

Figure 105 shows the ROC plot at different times for marker model PLAU on the ER+ N0 TAM
treated population. Figure A shows the plot at 60 months, figure B shows the plot at 72
months, figure C shows the plot at 84 months and figure D shows the plot at 96 months. Only
distant metastases are defined as events. Sensitivity (proportion of all relapsed patients in the
poor prognostic group), shown on the X-axis, and specificity (proportion of all relapse-free
patients in the good prognostic group), shown on the Y-axis, are calculated from KM estimates for
different thresholds (= 5, 6, 7, 8 years), and the estimated area under the curve (AUC) is
calculated. Values for the median cut off (triangle) and the best cut off (diamond, 0.77
quantile) are plotted.

AUC 60 months: 0.6 
AUC 72 months: 0.63 
AUC 84 months: 0.57 
AUC 96 months: 0.6 

Non Tamoxifen treated 

Non Tamoxifen treated (all relapses) ERBB2 (SEQ ID NO: 5) (Figure 65): p-value 0.21; cut 
off point: 0% 

Non Tamoxifen treated (distant only) ERBB2 (SEQ ID NO: 5) (Figure 66): p-value 0.23; cut 
off point: 0.6% 

Non Tamoxifen treated (all relapses) TFF1 (SEQ ID NO: 12) (Figure 67) : p-value 0.012; cut 
off point: 49.6% 

Non Tamoxifen treated (distant only) TFF1 (SEQ ID NO: 12) (Figure 68): p-value 0.016; cut 
off point: 45.4% 

Non Tamoxifen treated (all relapses) PLAU (SEQ ID NO: 16) (Figure 69): p-value 0.011; cut 
off point: 3.2% 

Non Tamoxifen treated (distant only) PLAU (SEQ ID NO: 16) (Figure 70): p-value 0.0082; 
cut off point: 5.5% 

Non Tamoxifen treated (all relapses) PITX2(SEQ ID NO:23) (I) (Figure 71): p-value 1.4e-06; 
cut off point: 35.4% 

Non Tamoxifen treated (distant only) PITX2(SEQ ID NO:23) (I) (Figure 72): p-value 1.7e-05;
cut off point: 41.2%

Non Tamoxifen treated (all relapses) PITX2(SEQ ID NO:23) (II) (Figure 73): p-value 
0.00026; cut off point: 56.1% 

Non Tamoxifen treated (distant only) PITX2(SEQ ID NO:23) (II) (Figure 74): p-value 
0.0026; cut off point: 61.9% 

Non Tamoxifen treated (all relapses) ONECUT2 (SEQ ID NO:35) (Figure 75): p-value 0.26; 
cut off point: 0% 

Non Tamoxifen treated (distant only) ONECUT2 (SEQ ID NO:35) (Figure 76): p-value 0.77; 
cut off point: 0% 

Non Tamoxifen treated (all relapses) TBC1D3 (SEQ ID NO: 43) (Figure 77): p-value 0.004;
cut off point: 98.6%

Non Tamoxifen treated (distant only) TBC1D3 (SEQ ID NO: 43) (Figure 78): p-value
0.00022; cut off point: 98.6%

Non Tamoxifen treated (all relapses) ABCA8 (SEQ ID NO: 49) (Figure 79): p-value 0.0065; 
cut off point: 60.9% 

Non Tamoxifen treated (distant only) ABCA8 (SEQ ID NO: 49) (Figure 80): p-value 0.15; 
cut off point: 49.2% 

Panels 

Based on the results of the single marker evaluations, it was decided to build models using the
marker candidates PITX2-Assay I, TFF1, and PLAU. All possible combinations of these
markers were evaluated.

Tamoxifen treated 

TAM treated (all relapses) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO: 16) (Figure 81): 
p-value 0.023; cut off point: 0.7 quantile 

TAM treated (distant only) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO: 16) (Figure 82): 
p-value 0.00084; cut off point: 0.72 quantile 

TAM treated (all relapses) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO: 16) and
PITX2(SEQ ID NO:23) (Figure 83): p-value 0.037; cut off point: 0.72 quantile

TAM treated (distant only) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO: 16) and
PITX2(SEQ ID NO:23) (Figure 84): p-value 0.0014; cut off point: 0.4 quantile

TAM treated (all relapses) PITX2(SEQ ID NO:23) and TFF1 (SEQ ID NO: 12) (Figure 85):
p-value 0.17; cut off point: 0.78 quantile

TAM treated (distant only) PITX2(SEQ ID NO:23) and TFF1 (SEQ ID NO: 12) (Figure 86): 
p-value 0.0048; cut off point: 0.32 quantile 

TAM treated (all relapses) PITX2(SEQ ID NO:23) and PLAU (SEQ ID NO: 16) (Figure 87): 
p-value 0.1; cut off point: 0.74 quantile 

TAM treated (distant only) PITX2(SEQ ID NO:23) and PLAU (SEQ ID NO: 16) (Figure 88): 
p-value 0.0081; cut off point: 0.44 quantile 

Figure 102 shows the ROC plot at different times for marker model PITX2 (Assay 1) and
TFF1 on the ER+ N0 TAM treated population. Figure A shows the plot at 60 months, figure B
shows the plot at 72 months, figure C shows the plot at 84 months and figure D shows the plot
at 96 months. Only distant metastases are defined as events. Sensitivity (proportion of all
relapsed patients in the poor prognostic group), shown on the X-axis, and specificity (proportion
of all relapse-free patients in the good prognostic group), shown on the Y-axis, are calculated from
KM estimates, and the estimated area under the curve (AUC) is calculated. Values for the median
cut off (triangle) and the best cut off (diamond, 0.32 quantile) are plotted.
AUC 60 months: 0.62 
AUC 72 months: 0.67 
AUC 84 months: 0.63 
AUC 96 months: 0.65 

Non Tamoxifen treated 

Non Tamoxifen treated (all relapses) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16) 
(Figure 89): p-value 0.0015; cut off point: 0.78 quantile 

Non Tamoxifen treated (distant only) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16) 
(Figure 90): p-value 0.003; cut off point: 0.8 quantile 

Non Tamoxifen treated (all relapses) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16)
and PITX2(SEQ ID NO:23) (Figure 91): p-value 8.9e-07; cut off point: 0.64 quantile

Non Tamoxifen treated (distant only) TFF1 (SEQ ID NO: 12) and PLAU (SEQ ID NO:16)
and PITX2(SEQ ID NO:23) (Figure 92): p-value 5.4e-05; cut off point: 0.66 quantile

Non Tamoxifen treated (all relapses) PITX2(SEQ ID NO:23) and TFF1 (SEQ ID NO: 12)
(Figure 93): p-value 1.9e-06; cut off point: 0.72 quantile

Non Tamoxifen treated (distant only) PITX2(SEQ ID NO:23) and TFF1 (SEQ ID NO: 12) 
(Figure 94): p-value 3.5e-05; cut off point: 0.76 quantile 

Non Tamoxifen treated (all relapses) PITX2(SEQ ID NO:23) and PLAU (SEQ ID NO: 16) 
(Figure 95): p-value 1.1e-06; cut off point: 0.68 quantile

Non Tamoxifen treated (distant only) PITX2(SEQ ID NO:23) and PLAU (SEQ ID NO:16) 
(Figure 96): p-value 1.5e-05; cut off point: 0.64 quantile 

Robustness of marker models 

To evaluate the robustness of the models, a crossvalidation was performed on model marker
panel PITX2 (Assay 1) plus TFF1 and marker panel PITX2 (Assay 1) alone, with 200
replicates. The stability of the assignment of a given patient to the bad or good outcome group
is illustrated in Figure 106; the left hand figure shows model marker panel PITX2 (Assay 1)
plus TFF1 and the right hand figure shows model marker panel PITX2 (Assay 1) alone. The
plot illustrates in how many crossvalidation replicates each patient is assigned to group 1
(light grey) or group 2 (dark grey).

Figure 107 illustrates the amino acid sequence of the polypeptide encoded by the gene PITX2. 

Figure 108 illustrates the positions of the amplificates sequenced in Example 3. 'A' shows an
illustration of the gene with the major exons annotated, 'B' shows annotated mRNA transcript
variants and 'C' shows CpG rich regions of the gene. The positions of Amplificates 1 to 11 are
shown to the right of the illustrations.

Figure 109 shows the sequencing data of 11 amplificates of the gene PITX2 according to
Example 3. Each column of the matrices of columns 'A' and 'B' represents the sequencing data
for one amplificate. The amplificate number is shown to the left of the matrices. Each row of
a matrix represents a single CpG site within the fragment and each column represents an
individual DNA sample. The matrices in the column marked 'A' showed below median methylation
as measured by QM assays, the matrices in the column marked 'B' showed below median
methylation as measured by QM assays. The bar on the left represents a scale of the percent
methylation, with the degree of methylation represented by the shade of each position within
the column from black representing 100% methylation to light grey representing 0%
methylation. White positions represent a measurement for which no data was available.

Figure 110 shows a schematic view of mRNA transcript variants of PITX2, as annotated in 
the on-line Ensembl database. 

Example 3: Sequencing of gene PITX2 

Sequencing of the gene PITX2 was carried out in order to confirm that co-methylation of
CpG positions correlated across all exons. For bisulfite sequencing, amplification primers
were designed to cover 11 sequences within the gene PITX2; see Figure 108 for further
details. Sixteen samples analysed in Example 4 were utilized for amplicon production. Each
sample was treated with sodium bisulfite and sequenced. Sequence data was obtained using
ABI 3700 sequencing technology. Obtained sequence traces were normalized and percentage
methylation calculated using the Applicant's proprietary bisulfite sequencing trace
analysis program (see WO 2004/000463 for further information).

Samples

Eight samples displayed hypermethylation and eight samples displayed hypomethylation in 
analysis using QM assay II as described in example 2. 

Amplification 

Fragments of interest were amplified using the following conditions:

PCR reaction solution:

Taq (5 U/µl)          0.2
dNTPs (25 mM each)    0.2
10x buffer            2.5
water                 10.1
primer (6.25 µM)      2
DNA (1 ng/µl)         10

Cycling conditions:

15 min     95 °C
30 s       95 °C
30 s       58 °C
1:30 min   72 °C
40 cycles

Sequencing 

Only G-rich primers were used for sequencing with one exception: Amplificate Number 2 
was sequenced using both forward and reverse primer. 

ExoSAP-IT reaction solution:
4 µl PCR product + 2 µl ExoSAP-IT
45 min at 37 °C and 15 min at 95 °C

Cycle sequencing:
1 µl BigDye v. 1.1
1 µl water
4 µl Sanger buffer
4 µl dNTP mix (0.025 mM each)
10 µl
+
5 µl primer (2 pmol/µl)
6 µl ExoSAP-IT product

Cycling

2 min at 96 °C, 26 cycles of (30 s at 96 °C, 15 s at 55 °C, 4 min at 60 °C)

Purification

A 96-well MultiScreen plate (Millipore) was filled with Sephadex G50 (Amersham) using an 
appropriate admeasure device. 300 µl of water was added to each well and incubated for 3 h at 4°C. 
Water was removed by spinning for 5 minutes at 910 g. The cycle sequencing product was loaded 
onto the plate and purified by spinning for 5 minutes at 910 g. 10 µl of formamide was added to each 
eluate. 

Results: 

All PCRs yielded a product. Figure 109 provides matrices produced from bisulfite sequencing 
data analysed by the Applicant's proprietary software (see WO 2004/000463 for further 
information). Each of the matrices in columns 'A' and 'B' represents the sequencing 
data for one amplificate. The amplificate number is shown to the left of the matrices. Each 
row of a matrix represents a single CpG site within the fragment and each column represents 
an individual DNA sample. The matrices in the column marked 'A' showed below median 
methylation as measured by QM assays (see Example 4); the matrices in the column marked 
'B' showed below median methylation as measured by QM assays. The bar on the left represents 
a scale of the percent methylation, with the degree of methylation represented by the 
shade of each position within the column from black representing 100% methylation to light 
grey representing 0% methylation. White positions represent measurements for which no 
data was available. 

Bisulfite sequencing indicated differential methylation of CpG sites between the two selected 
classes of samples; furthermore, co-methylation was observed across the gene. In particular, 
amplificates 4 to 7 showed a high level of differential methylation between the two analysed 
groups. 
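
One simple way to express differential methylation between two sample classes is the difference of the group means at each CpG site. The sketch below is illustrative only and is not the analysis used by the Applicant; it assumes matrices laid out as described above (rows are CpG sites, columns are samples), with `None` standing for positions without data.

```python
from typing import List, Optional, Sequence

Matrix = Sequence[Sequence[Optional[float]]]


def mean_ignoring_missing(values: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the available measurements, or None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def differential_methylation(group_a: Matrix, group_b: Matrix) -> List[Optional[float]]:
    """Per-CpG difference (group B mean minus group A mean) in percent methylation."""
    differences: List[Optional[float]] = []
    for row_a, row_b in zip(group_a, group_b):
        mean_a = mean_ignoring_missing(row_a)
        mean_b = mean_ignoring_missing(row_b)
        if mean_a is None or mean_b is None:
            differences.append(None)
        else:
            differences.append(mean_b - mean_a)
    return differences


if __name__ == "__main__":
    a = [[10.0, 20.0, None], [0.0, 0.0, 0.0]]
    b = [[80.0, 90.0, 100.0], [5.0, None, 15.0]]
    print(differential_methylation(a, b))
```
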
Table 1 

Accession no. | Gene name/loci | Genomic SEQ ID NO: | Pretreated methylated sequence (sense) SEQ ID NO: | Pretreated methylated strand (antisense) SEQ ID NO: | Pretreated unmethylated sequence (sense) SEQ ID NO: | Pretreated unmethylated sequence (antisense) SEQ ID NO: 
NM_001965 | EGR4 | 1 | 206 | 207 | 328 | 329 
NM_000038 | APC | 2 | 208 | 209 | 330 | 331 
NM_000077 | CDKN2A | 3 | 210 | 211 | 332 | 333 
NM_004385 | CSPG2 | 4 | 212 | 213 | 334 | 335 
NM_004448 | ERBB2 | 5 | 214 | 215 | 336 | 337 
NM_005563 | STMN1 | 6 | 216 | 217 | 338 | 339 
NM_000455 | STK11 | 7 | 218 | 219 | 340 | 341 
NM_001216 | CA9 | 8 | 220 | 221 | 342 | 343 
NM_001604 | PAX6 | 9 | 222 | 223 | 344 | 345 
NM_006142 | SFN | 10 | 224 | 225 | 346 | 347 
NM_005978 | S100A2 | 11 | 226 | 227 | 348 | 349 
NM_003225 | TFF1 | 12 | 228 | 229 | 350 | 351 
NM_003242 | TGFBR2 | 13 | 230 | 231 | 352 | 353 
NM_000546 | TP53 | 14 | 232 | 233 | 354 | 355 
NM_005427 | TP73 | 15 | 234 | 235 | 356 | 357 
NM_002658 | PLAU | 16 | 236 | 237 | 358 | 359 
NM_016192 | TMEFF2 | 17 | 238 | 239 | 360 | 361 
NM_000125 | ESR1 | 18 | 240 | 241 | 362 | 363 
NM_003177 | SYK | 19 | 242 | 243 | 364 | 365 
NM_001540 | HSPB1 | 20 | 244 | 245 | 366 | 367 
NM_007182 | RASSF1 | 21 | 246 | 247 | 368 | 369 
NM_015641 | TES | 22 | 248 | 249 | 370 | 371 
NM_000325 | PITX2 | 23 | 250 | 251 | 372 | 373 
NM_000836 | GRIN2D | 24 | 252 | 253 | 374 | 375 
NM_021154 | PSAT1 | 25 | 254 | 255 | 376 | 377 
NM_000735 | CGA | 26 | 256 | 257 | 378 | 379 
NM_000106 | CYP2D6 | 27 | 258 | 259 | 380 | 381 
NM_004718 | COX7A2L | 28 | 260 | 261 | 382 | 383 
NM_001437 | ESR2 | 29 | 262 | 263 | 384 | 385 
NM_002658 | PLAU | 30 | 264 | 265 | 386 | 387 
NM_000638 | VTN | 31 | 266 | 267 | 388 | 389 
NM_001055 | SULT1A1 | 32 | 268 | 269 | 390 | 391 
NM_003884 | PCAF | 33 | 270 | 271 | 392 | 393 
NM_006254 | PRKCD | 34 | 272 | 273 | 394 | 395 
NM_004852 | ONECUT2 | 35 | 274 | 275 | 396 | 397 
NM_001706 | BCL6 | 36 | 276 | 277 | 398 | 399 
NM_016312 | WBP11 | 37 | 278 | 279 | 400 | 401 
NM_002462 | MX1 | 38 | 280 | 281 | 402 | 403 
NM_138433 | MX1 | 39 | 282 | 283 | 404 | 405 
NM_000484 | APP | 40 | 284 | 285 | 406 | 407 
NM_002552 | ORC4L | 41 | 286 | 287 | 408 | 409 
NM_138999 | NETO1 | 42 | 288 | 289 | 410 | 411 
NM_032258 | TBC1D3 | 43 | 290 | 291 | 412 | 413 
NM_005310 | GRB7 | 44 | 292 | 293 | 414 | 415 
NM_000106 | CYP2D6 | 45 | 294 | 295 | 416 | 417 
NM_001259 | CDK6 | 46 | 296 | 297 | 418 | 419 
 | Sequence located within Chr 1p13.2 | 47 | 298 | 299 | 420 | 421 
 | Sequence located within Chr 17q25.1 | 48 | 300 | 301 | 422 | 423 
NM_007168 | ABCA8 | 49 | 302 | 303 | 424 | 425 
 | Sequence located within Chr 12q14.3 | 50 | 304 | 305 | 426 | 427 
 | Sequence located within Chr 8q12.1 | 51 | 306 | 307 | 428 | 429 
NM_017490 | MARK2 | 52 | 308 | 309 | 430 | 431 
NM_005229 | ELK1 | 53 | 310 | 311 | 432 | 433 
 | Q8WUT3 | 54 | 312 | 313 | 434 | 435 
NM_000737 | CGB | 55 | 314 | 315 | 436 | 437 
NM_001728 | BSG | 56 | 316 | 317 | 438 | 439 
NM_005881 | BCKDK | 57 | 318 | 319 | 440 | 441 
NM_014587 | SOX8 | 58 | 320 | 321 | 442 | 443 
NM_004393 | DAG1 | 59 | 322 | 323 | 444 | 445 
NM_020210 | SEMA4B | 60 | 324 | 325 | 446 | 447 
NM_000125 | ESR1 (exon8) | 61 | 204 | 327 | 448 | 449 
NM_000325 | PITX2 | 1130 | 1132 | 1133 | 1136 | 1137 
NM_003225 | TFF1 | 1131 | 1134 | 1135 | 1138 | 1139 
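
Table 1 pairs each genomic sequence with four pretreated variants (methylated and unmethylated, sense and antisense). The sketch below illustrates how such variants are commonly derived in silico, assuming the usual convention that bisulfite treatment converts every unmethylated cytosine to thymine and that, in the methylated variant, only cytosines in CpG context are protected. The function names and the toy input sequence are hypothetical; the actual sequences are those given in the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq: str, methylated: bool) -> str:
    """Convert C to T, keeping CpG cytosines when `methylated` is True."""
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            converted.append("C" if (methylated and in_cpg) else "T")
        else:
            converted.append(base)
    return "".join(converted)


def pretreated_sequences(genomic: str) -> dict:
    """Build the four pretreated variants for one genomic sequence."""
    antisense = reverse_complement(genomic)
    return {
        "methylated_sense": bisulfite_convert(genomic, methylated=True),
        "methylated_antisense": bisulfite_convert(antisense, methylated=True),
        "unmethylated_sense": bisulfite_convert(genomic, methylated=False),
        "unmethylated_antisense": bisulfite_convert(antisense, methylated=False),
    }


if __name__ == "__main__":
    for name, sequence in pretreated_sequences("ACGTCCGA").items():
        print(name, sequence)
```
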



Table 2 Primers and amplificates according to Example 1 

Gene: | Primer: | Amplificate Length: 
EGR4 (SEQ ID NO: 1) | AGGGGGATTGAGTGTTAAGT (SEQ ID NO: 450); CCCAAACATAAACACAAAAT (SEQ ID NO: 451) | 294 
APC (SEQ ID NO: 2) | TCAACTACCATCAACTTCCTTA (SEQ ID NO: 452); AATTTATTTTTAGTGTTGTAGTGGG (SEQ ID NO: 453) | 491 
CDKN2A (SEQ ID NO: 3) | GGGGTTGGTTGGTTATTAGA (SEQ ID NO: 454); AACCCTCTACCCACCTAAAT (SEQ ID NO: 455) | 256 
CSPG2 (SEQ ID NO: 4) | GGATAGGAGTTGGGATTAAGAT (SEQ ID NO: 456); AAATCTTTTTCAACACCAAAAT (SEQ ID NO: 457) | 414 
ERBB2 (SEQ ID NO: 5) | GGAGGGGGTAGAGTTATTAGTT (SEQ ID NO: 458); TATACTTCCTCAAACAACCCTC (SEQ ID NO: 459) | 257 
STMN1 (SEQ ID NO: 6) | GAGTTT[illegible]TATTTAAGTTGAGTGGTT (SEQ ID NO: 460); AACAAAACAATACCCCTTCTAA (SEQ ID NO: 461) | [illegible] 
STMN1 (SEQ ID NO: 6) | [illegible] (SEQ ID NO: 463); GAAAGGTAGGGAAGGATTTTT (SEQ ID NO: 462) | [illegible] 
STK11 (SEQ ID NO: 7) | TAAAAGAAGGATTTTTGATTGG (SEQ ID NO: 464); CATCTTATTTACCTCCCTCCC (SEQ ID NO: 465) | 528 
CA9 (SEQ ID NO: 8) | GGGAAGTAGGTTAGGGTTAGTT (SEQ ID NO: 466); AAATCCTCCTCTCCAAATAAAT (SEQ ID NO: 467) | [illegible] 
PAX6 (SEQ ID NO: 9) | GGAGGGGAGAGGGTTATG (SEQ ID NO: 468); TACTATACACACCCCAAAACAA (SEQ ID NO: 469) | 374 
SFN (SEQ ID NO: 10) | GAAGAGAGGAGAGGGAGGTA (SEQ ID NO: 470); CTATCCAACAAACCCAACA (SEQ ID NO: 471) | 489 
S100A2 (SEQ ID NO: 11) | GTTTTTAAGTTGGAGAAGAGGA (SEQ ID NO: 472); ACCTATAAATCACAACCCACTC (SEQ ID NO: 473) | 460 
TFF1 (SEQ ID NO: 12) | TTGGTGATGTTGATTAGAGTTT (SEQ ID NO: 474); TAAAACACCTTACATTTTCCCT (SEQ ID NO: 475) | 449 
TGFBR2 (SEQ ID NO: 13) | GTAATTTGAAGAAAGTTGAGGG (SEQ ID NO: 476); CCAACAACTAAACAAAACCTCT (SEQ ID NO: 477) | 296 
TP53 (SEQ ID NO: 14) | TTGATGAGAAGAAAGGATTTAGT (SEQ ID NO: 478); TCAAATTCAATCAAAAACTTACC (SEQ ID NO: 479) | 496 
TP73 (SEQ ID NO: 15) | AGTAAATAGTGGGTGAGTTATGAA (SEQ ID NO: 480); GAAAAACCTCTAAAAACTACTCTCC (SEQ ID NO: 481) | 607 
PLAU (SEQ ID NO: 16) | GAGAGAGATAGTTGGGGAGTTT (SEQ ID NO: 482); CAAACAAACTTCATCTACCAAATAC (SEQ ID NO: 483) | 453 
TMEFF2 (SEQ ID NO: 17) | TGTTGGTTGTTGTTGTTGTT (SEQ ID NO: 484); CTTTCTACCCATCCCAAAA (SEQ ID NO: 485) | 319 
ESR1 (SEQ ID NO: 18) | CTATCAATTCCCCCAACTACT (SEQ ID NO: 487); TTGTTGGATAGAGGTTGAGTTT (SEQ ID NO: 486) | 349 
SYK (SEQ ID NO: 19) | GTGGGTTTTGGGTAGTTATAGA (SEQ ID NO: 488); TAACCTCCTCTCCTTACCAA (SEQ ID NO: 489) | 485 
HSPB1 (SEQ ID NO: 20) | CCTACCTCTACCACTTCTCAAT (SEQ ID NO: 491); AAGAGGGTTTAGTTTTTATTT (SEQ ID NO: 490) | 216 
RASSF1 (SEQ ID NO: 21) | AGTGGGTAGGTTAAGTGTGTTG (SEQ ID NO: 492); CCCCAAAATCCAAACTAAA (SEQ ID NO: 493) | 319 
TES (SEQ ID NO: 22) | AGGTTGGGGATTTTAGTTTTT (SEQ ID NO: 494); ACCTTCTTCACTTTATTTTCCA (SEQ ID NO: 495) | 448 
PITX2 (SEQ ID NO: 23) | TCCTCAACTCTACAAACCTAAAA (SEQ ID NO: 497); GTAGGGGAGGGAAGTAGATGT (SEQ ID NO: 496) | 408 
GRIN2D (SEQ ID NO: 24) | ATAGTTTGTGGTTTGGATTTTT (SEQ ID NO: 498); AAAACCTTTCCCTAACTTCAAT (SEQ ID NO: 499) | 435 
PSAT1 (SEQ ID NO: 25) | GTAGGTGGTTAATTTTGGGTT (SEQ ID NO: 500); CTCATTCACACTATATCCATTCA (SEQ ID NO: 501) | 500 
PSAT1 (SEQ ID NO: 25) | TAAGAGAGAGGAGTTGAGGTTT (SEQ ID NO: 502); CCAAAATTAACCACCTACCTAA (SEQ ID NO: 503) | 478 
CGA (SEQ ID NO: 26) | TAGTGGTATAAGTTTGGAAATGTT (SEQ ID NO: 504); TCCACCTACATCTAAACCCTAA (SEQ ID NO: 505) | 364 
CYP2D6 (SEQ ID NO: 27) | CCTCCTAAACTAAATCCAACAA (SEQ ID NO: 507); GGGGTTAAGGTTTTTATGGTA (SEQ ID NO: 506) | 418 
COX7A2L (SEQ ID NO: 28) | AATCCTAAAAACCCTAACTTTTAAT (SEQ ID NO: 509); GGAGGTGTAAGGAGAATAGAGA (SEQ ID NO: 508) | 398 
ESR2 (SEQ ID NO: 29) | AAACCTTCCCAATAACCTCTTA (SEQ ID NO: 511); TAGAGGGGAGTAGTGTTTGAGT (SEQ ID NO: 510) | 471 
PLAU (SEQ ID NO: 30) | GTGATATTTGGGGATTGTTATT (SEQ ID NO: 512); ACTCCCTCCCCTATCTTACA (SEQ ID NO: 513) | 479 
VTN (SEQ ID NO: 31) | GTTATTTGGGTTAATGTAGGGA (SEQ ID NO: 514); TCTATCCCCTCAAACTTAAAAA (SEQ ID NO: 515) | 492 
SULT1A1 (SEQ ID NO: 32) | ATACTACCAAACCCACTCAAAC (SEQ ID NO: 517); GAATTTAGGGAAGGAGTTAGTTG (SEQ ID NO: 516) | 448 
PCAF (SEQ ID NO: 33) | GGATAAATGATTGAGAGGTTGT (SEQ ID NO: 518); CCTCCCTTAATTCTCCTACC (SEQ ID NO: 519) | 369 
PRKCD (SEQ ID NO: 34) | CTTAACCCATCCCAATCA (SEQ ID NO: 521); GATAGAAGGATTTTAGTTTTTATTGTT (SEQ ID NO: 520) | 322 
ONECUT2 (SEQ ID NO: 35) | TTTGTTGGGATTTGTTAGGAT (SEQ ID NO: 522); AAACATTTTACCCCTCTAAACC (SEQ ID NO: 523) | 467 
BCL6 (SEQ ID NO: 36) | CATCACCACTTCTAAAAACCC (SEQ ID NO: 525); GGGTAAGAAAGAAGGAATTAGTTT (SEQ ID NO: 524) | 456 
WBP11 (SEQ ID NO: 37) | AAGAGGTGAGGAAGAGTAGTAAAT (SEQ ID NO: 526); CTCCCAACAACTAAATCAAAAT (SEQ ID NO: 527) | 437 
MX1 (SEQ ID NO: 38) | TGTAGGAGAGGTTGGGAAG (SEQ ID NO: 528); CCAAACATAACATCCACTAAAA (SEQ ID NO: 529) | 341 
MX1 (SEQ ID NO: 39) | TAGGTTTAAGAGGAGAGGGAAT (SEQ ID NO: 530); AAACAACTACCCAAATCCAAC (SEQ ID NO: 531) | 433 
APP (SEQ ID NO: 40) | GAGTAAGGAAGGGGGATG (SEQ ID NO: 532); AACCCAAATCTTTAATACAAAAA (SEQ ID NO: 533) | 494 
NETO1 (SEQ ID NO: 42) | GGAGTTTTTAGAAGAGGAAGATT (SEQ ID NO: 534); ACTTCACAATAAATACCCTCCC (SEQ ID NO: 535) | 395 
TBC1D3 (SEQ ID NO: 43) | GGTAGAGGAAGTAGTTGGTTTG (SEQ ID NO: 536); CTTTTATATTTCTCCCAATCTCC (SEQ ID NO: 537) | 490 
GRB7 (SEQ ID NO: 44) | AAAATCCATAACCACCAAAATA (SEQ ID NO: 539); TTAGGAAGTTTTAGGAATGAGG (SEQ ID NO: 538) | 416 
CYP2D6 (SEQ ID NO: 45) | AATTTCCTAACCCACTATCCTC (SEQ ID NO: 541); ATTTGTAGTTTGGGGTGATTT (SEQ ID NO: 540) | 379 
CDK6 (SEQ ID NO: 46) | ACCTTAAACACCTTCCCATAA (SEQ ID NO: 543); GTGTAATGATTTTGGATTGAGA (SEQ ID NO: 542) | 456 
SEQ ID NO: 47 | AAGGAAGGTAGAGGGTTGAGT (SEQ ID NO: 544); AAAATCCAAAATTAACACCATT (SEQ ID NO: 545) | 499 
SEQ ID NO: 48 | AGTAGATGAAGTTGGGGATTAG (SEQ ID NO: 546); TCCTACTATCCCTTCTCAAAAA (SEQ ID NO: 547) | 500 
ABCA8 (SEQ ID NO: 49) | TGATTGTGTAGATTATTTTTGGTT (SEQ ID NO: 548); CAAACTCTCTAAACCTCAATCTC (SEQ ID NO: 549) | 499 
SEQ ID NO: 50 | ACCCTAACATTCTCTAAACAACA (SEQ ID NO: 551); GATGAAAGTGGAAAGATTATGG (SEQ ID NO: 550) | 441 
SEQ ID NO: 51 | CTCCAACTCTCCTCACCTC (SEQ ID NO: 553); ATTTGAAGGTTGTGTTTGTAGA (SEQ ID NO: 552) | 343 
MARK2 (SEQ ID NO: 52) | TCACCACTATCCTCAATAATCA (SEQ ID NO: 555); TAAAGTAGGAAGGTTTGGTTTG (SEQ ID NO: 554) | 476 
ELK1 (SEQ ID NO: 53) | CCTCTAATTCCTATCAATCACC (SEQ ID NO: 557); TTAGAAGTGAAAGTAGAAGGGTTT (SEQ ID NO: 556) | 435 
Q8WUT3 (SEQ ID NO: 54) | GGTTAGAAGTTAGAGGGGTAGG (SEQ ID NO: 558); CCATCCCATTACCTATAAAAAT (SEQ ID NO: 559) | 406 
CGB (SEQ ID NO: 55) | TCCACCCTATTTTCTACCAA (SEQ ID NO: 561); TTTGTTTTAGGTGGTGTGTAAT (SEQ ID NO: 560) | 417 
BSG (SEQ ID NO: 56) | TTATCTATCCCCACACCCTAAT (SEQ ID NO: 563); GGAGTAGGTGAGGAGTATTTTG (SEQ ID NO: 562) | 420 
BCKDK (SEQ ID NO: 57) | TCACCTCCTTTTACAACCAAT (SEQ ID NO: 565); TTTGGGAGAGTTTTAGGATTTA (SEQ ID NO: 564) | 258 
SOX8 (SEQ ID NO: 58) | GGGTGGGTAGTAGGTTTGTT (SEQ ID NO: 566); ACACACTCCTTAAAACTCTTCC (SEQ ID NO: 567) | 435 
DAG1 (SEQ ID NO: 59) | AATACCAACCCAAACATCTACC (SEQ ID NO: 569); TTTGGTTATGTGGAGTTTATTGT (SEQ ID NO: 568) | 315 
ORC4L (SEQ ID NO: 41) | CACTCAAAACTTCCCTACCTAC (SEQ ID NO: 571); GGTAATGGTGGGGGTAAAT (SEQ ID NO: 570) | 489 
SEMA4B (SEQ ID NO: 60) | ACCAAAATACTACTCCCAAATC (SEQ ID NO: 573); GGGTAGAGGGAGGTTATTGTT (SEQ ID NO: 572) | 337 
ESR1 (exon8) (SEQ ID NO: 61) | TATGATTTGTTGTTGGAGATGT (SEQ ID NO: 574); CTTAAAATCCCTTTAACTATTCCC (SEQ ID NO: 575) | 388 
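
In the primer pairs of Table 2, one primer of each legible pair contains no cytosine and the other no guanine, and neither contains a CpG dinucleotide, which is what one would expect of primers designed against bisulfite-converted DNA. The sketch below shows one way to check a primer list for these properties; the function names and labels are illustrative and not part of the original method.

```python
def primer_composition(primer: str) -> str:
    """Classify a primer as 'no_C', 'no_G' or 'mixed' by base content."""
    bases = set(primer.upper())
    if not bases <= set("ACGT"):
        raise ValueError(f"unexpected characters in primer: {primer!r}")
    if "C" not in bases:
        return "no_C"
    if "G" not in bases:
        return "no_G"
    return "mixed"


def contains_cpg(primer: str) -> bool:
    """True if the primer spans a CpG dinucleotide."""
    return "CG" in primer.upper()


if __name__ == "__main__":
    # EGR4 primer pair from Table 2 (SEQ ID NO: 450 and 451).
    pair = ["AGGGGGATTGAGTGTTAAGT", "CCCAAACATAAACACAAAAT"]
    for primer in pair:
        print(primer, len(primer), primer_composition(primer), contains_cpg(primer))
```
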



Table 3 Hybridisation oligonucleotides according to Example 1 

Gene | Oligo: 
ONECUT2 (SEQ ID NO: 35) | TACGTAGTTGCGCGTT (SEQ ID NO: 800) 
ONECUT2 (SEQ ID NO: 35) | GTATGTAGTTGTGTGTT (SEQ ID NO: 801) 
ONECUT2 (SEQ ID NO: 35) | TTTTGTGCGTACGGAT (SEQ ID NO: 802) 
ONECUT2 (SEQ ID NO: 35) | TTGTGTGTATGGAT (SEQ ID NO: 803) 
ONECUT2 (SEQ ID NO: 35) | TTAAGCGGGCGTTGAT (SEQ ID NO: 804) 
ONECUT2 (SEQ ID NO: 35) | TTAAGTGGGTGTTGAT (SEQ ID NO: 805) 
ONECUT2 (SEQ ID NO: 35) | TAGAGGCGCGGGTTAT (SEQ ID NO: 806) 
ONECUT2 (SEQ ID NO: 35) | TAGAGGTGTGGGTTAT (SEQ ID NO: 807) 
BCL6 (SEQ ID NO: 36) | ATTTCGAAATATGTCGG (SEQ ID NO: 1004) 
BCL6 (SEQ ID NO: 36) | ATTTTGAAATATGTTGGT (SEQ ID NO: 1005) 
BCL6 (SEQ ID NO: 36) | ATTCGAGACGTTTTGT (SEQ ID NO: 1006) 
BCL6 (SEQ ID NO: 36) | TTTGAGATGTTTTGTTTA (SEQ ID NO: 1007) 
BCL6 (SEQ ID NO: 36) | TTCGAGTTTCGAATCGG (SEQ ID NO: 1008) 
BCL6 (SEQ ID NO: 36) | TTTGAGTTTTGAATTGGA (SEQ ID NO: 1009) 
BCL6 (SEQ ID NO: 36) | ATAGCGAAGGCGTCGA (SEQ ID NO: 1010) 
BCL6 (SEQ ID NO: 36) | TATAGTGAAGGTGTTGA (SEQ ID NO: 1011) 
WBP11 (SEQ ID NO: 37) | TTACGAGAAGCGGGTA (SEQ ID NO: 946) 
WBP11 (SEQ ID NO: 37) | ATTATGAGAAGTGGGTA (SEQ ID NO: 947) 
WBP11 (SEQ ID NO: 37) | AGGGGGCGATTTTCGG (SEQ ID NO: 948) 
WBP11 (SEQ ID NO: 37) | TAGGGGGTGATTTTTGG (SEQ ID NO: 949) 
WBP11 (SEQ ID NO: 37) | TTAGCGTCGTTTGATT (SEQ ID NO: 950) 
WBP11 (SEQ ID NO: 37) | TTTTAGTGTTGTTTGATT (SEQ ID NO: 951) 
WBP11 (SEQ ID NO: 37) | AGTTCGTTTTATTGCGT (SEQ ID NO: 952) 
WBP11 (SEQ ID NO: 37) | GAGTTTGTTTTATTGTGT (SEQ ID NO: 953) 
MX1 (SEQ ID NO: 38) | AACGCGCGAAAGTAAA (SEQ ID NO: 576) 
MX1 (SEQ ID NO: 38) | TTGGGAATGTGTGAAA (SEQ ID NO: 577) 
MX1 (SEQ ID NO: 38) | TTCGAGTTGGGTCGAGA (SEQ ID NO: 578) 
MX1 (SEQ ID NO: 38) | TTTGAGTTGGGTTGAGA (SEQ ID NO: 579) 
MX1 (SEQ ID NO: 38) | TATGCGCGGGAAGATT (SEQ ID NO: 580) 
MX1 (SEQ ID NO: 38) | GTATGTGTGGGAAGAT (SEQ ID NO: 581) 
MX1 (SEQ ID NO: 38) | ATTTACGGTTGCGGGG (SEQ ID NO: 582) 
MX1 (SEQ ID NO: 38) | TATGGTTGTGTGGGTTA (SEQ ID NO: 583) 
MX1 (SEQ ID NO: 39) | AGGCGTTTATAGTCGGT (SEQ ID NO: 584) 
MX1 (SEQ ID NO: 39) | AGGTGTTTATAGTTGGT (SEQ ID NO: 585) 
MX1 (SEQ ID NO: 39) | TTTCGAGTTCGGAGTA (SEQ ID NO: 586) 
MX1 (SEQ ID NO: 39) | TTTTGAGTTTGGAGTAG (SEQ ID NO: 587) 
MX1 (SEQ ID NO: 39) | TTGTCGGTCGTAGCGG (SEQ ID NO: 588) 
MX1 (SEQ ID NO: 39) | TTTGTTGGTTGTAGTGG (SEQ ID NO: 589) 
MX1 (SEQ ID NO: 39) | TTCGTTACGGCGGTAG (SEQ ID NO: 590) 
MX1 (SEQ ID NO: 39) | AGTTTGTTATGGTGGT (SEQ ID NO: 591) 
APP (SEQ ID NO: 40) | TGAAACGAGGCGGAGA (SEQ ID NO: 592) 
APP (SEQ ID NO: 40) | TGAAATGAGGTGGAGA (SEQ ID NO: 593) 
APP (SEQ ID NO: 40) | GACGI 1GCG1 1 1 1GGG (SEQ ID NO: 594) 
APP (SEQ ID NO: 40) | GGATGTTGTGTTTTTGG (SEQ ID NO: 595) 
APP (SEQ ID NO: 40) | 1111 1AGGGGGIGGGA (SEQ ID NO: 596) 
APP (SEQ ID NO: 40) | TTTTTAGTGGGTTGGA (SEQ ID NO: 597) 
APP (SEQ ID NO: 40) | GGACGTTCGTAAGCGG (SEQ ID NO: 598) 
APP (SEQ ID NO: 40) | GGATGTTTGTAAGTGG (SEQ ID NO: 599) 
ORC4L (SEQ ID NO: 41) | TTATACGCGTTGTTTAT (SEQ ID NO: 600) 
ORC4L (SEQ ID NO: 41) | TGTATTATATGTGTTGTTT (SEQ ID NO: 601) 
ORC4L (SEQ ID NO: 41) | AGCGTGACGGTTCGAG (SEQ ID NO: 602) 
ORC4L (SEQ ID NO: 41) | AGTGTGATGGTTTGAG (SEQ ID NO: 603) 
ORC4L (SEQ ID NO: 41) | Al 1AGGGGAG1 1 1GG1 (SEQ ID NO: 604) 
ORC4L (SEQ ID NO: 41) | TTAGGTGAGTTTTGTTT (SEQ ID NO: 605) 
NETO1 (SEQ ID NO: 42) | 1AGG1 1GGG1 1 1 1 AGGA (SEQ ID NO: 606) 
NETO1 (SEQ ID NO: 42) | TTATGTTTGGTTTTATGAT (SEQ ID NO: 607) 
NETO1 (SEQ ID NO: 42) | 1 1ACGICGG1 1 1GGA1 (SEQ ID NO: 608) 
NETO1 (SEQ ID NO: 42) | TTTATGTTGGTTTTGATT (SEQ ID NO: 609) 
NETO1 (SEQ ID NO: 42) | 1 1CGG1 1 IGGGGAAAG (SEQ ID NO: 610) 
NETO1 (SEQ ID NO: 42) | TTTGGTTTTGGGAAAGG (SEQ ID NO: 611) 
NETO1 (SEQ ID NO: 42) | 1G1GG1AGG1G1 1 1A1 (SEQ ID NO: 612) 
NETO1 (SEQ ID NO: 42) | AATTTTTGTTGTATGTGT (SEQ ID NO: 613) 
TBC1D3 (SEQ ID NO: 43) | 1 Al 1CGCGGGGGG1 1 1 (SEQ ID NO: 988) 
TBC1D3 (SEQ ID NO: 43) | TAGTATTTGTGGGTGG (SEQ ID NO: 989) 
TBC1D3 (SEQ ID NO: 43) | ATTCGGCGGGAGATTA (SEQ ID NO: 990) 
TBC1D3 (SEQ ID NO: 43) | AGTAAATTTGGTGGGA (SEQ ID NO: 991) 
TBC1D3 (SEQ ID NO: 43) | AGATTAGTCGAAAGAGT (SEQ ID NO: 992) 
TBC1D3 (SEQ ID NO: 43) | GAGATTAGTTGAAAGAGT (SEQ ID NO: 993) 
TBC1D3 (SEQ ID NO: 43) | TATATTTCGGGGTTTTAA (SEQ ID NO: 994) 
TBC1D3 (SEQ ID NO: 43) | TATATTTTGGGGTTTTAAA (SEQ ID NO: 995) 

GRB7 (SEQ ID NO: 44) | ATAGTTTCGTTATTTGTAT (SEQ ID NO: 1062) 
GRB7 (SEQ ID NO: 44) | GGTATAGTTTTGTTATTTG (SEQ ID NO: 1063) 
GRB7 (SEQ ID NO: 44) | TTTAGTACGGGGTGTA (SEQ ID NO: 1064) 
GRB7 (SEQ ID NO: 44) | TTTTAGTATGGGGTGTA (SEQ ID NO: 1065) 
GRB7 (SEQ ID NO: 44) | GGCGTTATAGTTACGTTT (SEQ ID NO: 1066) 
GRB7 (SEQ ID NO: 44) | GGGTGTTATAGTTATGTT (SEQ ID NO: 1067) 
GRB7 (SEQ ID NO: 44) | TGTTTATCGAAGGTAGA (SEQ ID NO: 1068) 
GRB7 (SEQ ID NO: 44) | TGTTTATTGAAGGTAGAA (SEQ ID NO: 1069) 
CYP2D6 (SEQ ID NO: 45) | GAGATCGCGTTTTCGT (SEQ ID NO: 844) 
CYP2D6 (SEQ ID NO: 45) | AGAGATTGTGTTTTTGT (SEQ ID NO: 845) 
CYP2D6 (SEQ ID NO: 45) | ATTCGCGGCGAGGATA (SEQ ID NO: 846) 
CYP2D6 (SEQ ID NO: 45) | GATTTGTGGTGAGGAT (SEQ ID NO: 847) 
CYP2D6 (SEQ ID NO: 45) | GTCGTTTCGGGGACGT (SEQ ID NO: 848) 
CYP2D6 (SEQ ID NO: 45) | GTTGTTTTGGGGATGTG (SEQ ID NO: 849) 
CYP2D6 (SEQ ID NO: 45) | TAAGTAGCGTCGATAG (SEQ ID NO: 850) 
CYP2D6 (SEQ ID NO: 45) | AAGTAGTGTTGATAGGG (SEQ ID NO: 851) 
CDK6 (SEQ ID NO: 46) | TACGAATGCGTGGCGG (SEQ ID NO: 866) 
CDK6 (SEQ ID NO: 46) | TATGAATGTGTGGTGGA (SEQ ID NO: 867) 
CDK6 (SEQ ID NO: 46) | TTTCGGAGTAGGCGAG (SEQ ID NO: 868) 
CDK6 (SEQ ID NO: 46) | TTTTGGAGTAGGTGAG (SEQ ID NO: 869) 
CDK6 (SEQ ID NO: 46) | TACGTTAGTTTCGCGG (SEQ ID NO: 870) 
CDK6 (SEQ ID NO: 46) | TATGTTAGTTTTGTGGG (SEQ ID NO: 871) 
CDK6 (SEQ ID NO: 46) | ATTGAGACGCGTTTGG (SEQ ID NO: 872) 
CDK6 (SEQ ID NO: 46) | GAGATGTGTTTGGGTA (SEQ ID NO: 873) 
SEQ ID NO: 47 (SEQ ID NO: 47) | TAAATTCGACGGGTTT (SEQ ID NO: 1054) 
SEQ ID NO: 47 (SEQ ID NO: 47) | ATTTGATGGGTTTTTGT (SEQ ID NO: 1055) 
SEQ ID NO: 47 (SEQ ID NO: 47) | TTTTCGTTCGGCGGAG (SEQ ID NO: 1056) 
SEQ ID NO: 47 (SEQ ID NO: 47) | TTTGTTTGGTGGAGGTT (SEQ ID NO: 1057) 
SEQ ID NO: 47 (SEQ ID NO: 47) | TTCGCGTTTATCGTGT (SEQ ID NO: 1058) 
SEQ ID NO: 47 (SEQ ID NO: 47) | TGGTTTGTGTTTATTGT (SEQ ID NO: 1059) 
SEQ ID NO: 47 (SEQ ID NO: 47) | TTTCGCGGTTCGTAGT (SEQ ID NO: 1060) 
SEQ ID NO: 47 (SEQ ID NO: 47) | TTTGTGGTTTGTAGTTTA (SEQ ID NO: 1061) 
SEQ ID NO: 48 (SEQ ID NO: 48) | TTAGGTCGGGAGGAAA (SEQ ID NO: 614) 
SEQ ID NO: 48 (SEQ ID NO: 48) | TTAGGTTGGGAGGAAA (SEQ ID NO: 615) 
SEQ ID NO: 48 (SEQ ID NO: 48) | TTAGACGTGGGGCGAT (SEQ ID NO: 616) 
SEQ ID NO: 48 (SEQ ID NO: 48) | TTAGATGTGGGGTGAT (SEQ ID NO: 617) 
SEQ ID NO: 48 (SEQ ID NO: 48) | TAAGGTACGAGCGTGT (SEQ ID NO: 618) 
SEQ ID NO: 48 (SEQ ID NO: 48) | AAGGTATGAGTGTGTG (SEQ ID NO: 619) 
SEQ ID NO: 48 (SEQ ID NO: 48) | GTAGAGTACGAGAGATT (SEQ ID NO: 620) 
SEQ ID NO: 48 (SEQ ID NO: 48) | GGTAGAGTATGAGAGAT (SEQ ID NO: 621) 
ABCA8 (SEQ ID NO: 49) | ATTTGGTTTCGAAGTTT (SEQ ID NO: 996) 
ABCA8 (SEQ ID NO: 49) | TATTTGGTTTTGAAGTTT (SEQ ID NO: 997) 
ABCA8 (SEQ ID NO: 49) | TTTTCGGAATTCGGGT (SEQ ID NO: 998) 
ABCA8 (SEQ ID NO: 49) | TTTTGGAATTTGGGTGT (SEQ ID NO: 999) 
ABCA8 (SEQ ID NO: 49) | TTTCGGTTTTTAACGGT (SEQ ID NO: 1000) 
ABCA8 (SEQ ID NO: 49) | TTTTGGTTTTTAATGGTG (SEQ ID NO: 1001) 
ABCA8 (SEQ ID NO: 49) | AAAATTTACGAGGGGA (SEQ ID NO: 1002) 
ABCA8 (SEQ ID NO: 49) | TTAAAATTTATGAGGGGA (SEQ ID NO: 1003) 

bBQ ID NO: 5U (oJiQ ID NO: 50) 


a hp r 1 a rir< a ' i t 1 a TTnr'n/~' a /or?A ttx \Trv. y^oo\ 

A 1 (jACCjA 1 OA 1 lOOCOA (SEQ ID NO: 622) 


SEQ ID NO: 50 (SEQ ID NO: 50) 


GATGATGATTGGTGAGT (SEQ ID NO: 623) 


SEQ ID NO: 50 (SEQ ID NO: 50) 


TTATGACGTTTAATCGT (SEQ ID NO: 624) 


SEQ ID NO: 50 (SEQ ID NO: 50) 


AGTTATGATGTTTAATTGT (SEQ ID NO: 625) 


SEQ ID NO: 50 (SEQ ID NO: 50) 


AATCGAACGTTGGCGT (SEQ ID NO: 626) 


SEQ ID NO: 50 (SEQ ID NO: 50) 


AAATTGAATGTTGGTGT (SEQ ID NO: 627)


SEQ ID NO: 51 (SEQ ID NO: 51)


TATTCGGGTTTCGCGA (SEQ ID NO: 1070)


SEQ ID NO: 51 (SEQ ID NO: 51)


ATTTGGGTTTTGTGAG (SEQ ID NO: 1071)


SEQ ID NO: 51 (SEQ ID NO: 51)


TATTGTTACGCGTCGA (SEQ ID NO: 1072)


SEQ ID NO: 51 (SEQ ID NO: 51)


ATTGTTATGTGTTGATTT (SEQ ID NO: 1073)


SEQ ID NO: 51 (SEQ ID NO: 51)


GACGTGTAGGTCGTAT (SEQ ID NO: 1074)


SEQ ID NO: 51 (SEQ ID NO: 51)


GATGTGTAGGTTGTATT (SEQ ID NO: 1075)


SEQ ID NO: 51 (SEQ ID NO: 51)


TTCGGGAACGATTTTT (SEQ ID NO: 1076)


SEQ ID NO: 51 (SEQ ID NO: 51)


GGGTTTGGGAATGATT (SEQ ID NO: 1077)


MARK2 (SEQ ID NO: 52)


ATATTTCGGGGGAAGT (SEQ ID NO: 628)


MARK2 (SEQ ID NO: 52)


TATATTTTGGGGGAAGT (SEQ ID NO: 629)


MARK2 (SEQ ID NO: 52)


TTTCGTATTTGTCGGA (SEQ ID NO: 630)


MARK2 (SEQ ID NO: 52)


TTTGTATTTGTTGGAGT (SEQ ID NO: 631)


MARK2 (SEQ ID NO: 52)


GGTTATATCGTAGGGTA (SEQ ID NO: 632)


MARK2 (SEQ ID NO: 52)


GGGTTATATTGTAGGGT (SEQ ID NO: 633)


MARK2 (SEQ ID NO: 52)


AGGGGGACGAATTAGG (SEQ ID NO: 634)


MARK2 (SEQ ID NO: 52)


GAGGGGGATGAATTAG (SEQ ID NO: 635)


ELK1 (SEQ ID NO: 53)


GGTCGGGGTTGATTTTA (SEQ ID NO: 920)


ELK1 (SEQ ID NO: 53)


GGTTGGTGTTGATTTTA (SEQ ID NO: 921)


ELK1 (SEQ ID NO: 53)


GTCGGGATTCGAACGG (SEQ ID NO: 922)


ELK1 (SEQ ID NO: 53)


GTTGGGATTTGAATGG (SEQ ID NO: 923)


ELK1 (SEQ ID NO: 53)


GTCGGAAGTTTCGGGA (SEQ ID NO: 924)


ELK1 (SEQ ID NO: 53)


GTTGGAAGTTTTGGGAT (SEQ ID NO: 925)


ELK1 (SEQ ID NO: 53)


ATATCGTAGGGTAGGCGG (SEQ ID NO: 926)


ELK1 (SEQ ID NO: 53)


ATATTGTAGGGTAGGTGG (SEQ ID NO: 927)


Q8WUT3 (SEQ ID NO: 54)


TAGAACGGCGTGGGAT (SEQ ID NO: 636)


Q8WUT3 (SEQ ID NO: 54)


TAGAATGGTGTGGGAT (SEQ ID NO: 637)


Q8WUT3 (SEQ ID NO: 54)


GTCGCGATGTAGTTACGT (SEQ ID NO: 638)


Q8WUT3 (SEQ ID NO: 54)


GTTGTGATGTAGTTATGT (SEQ ID NO: 639)


Q8WUT3 (SEQ ID NO: 54)


TTAGTTTCGGGATCGG (SEQ ID NO: 640)


Q8WUT3 (SEQ ID NO: 54)


TTTAGTTTTGGGATTGG (SEQ ID NO: 641)


Q8WUT3 (SEQ ID NO: 54)


TTCGTTTTTCGGGATA (SEQ ID NO: 642)


Q8WUT3 (SEQ ID NO: 54)


TTTGTTTTTTGGGATAAA (SEQ ID NO: 643)


CGB (SEQ ID NO: 55)


TTACGTCGTGGTTTTTA (SEQ ID NO: 954)


CGB (SEQ ID NO: 55)


TTATGTTGTGGTTTTTAG (SEQ ID NO: 955)


CGB (SEQ ID NO: 55)


GGCGTGAATTTCGTGG (SEQ ID NO: 956)


CGB (SEQ ID NO: 55)


GGTGTGAATTTTGTGGT (SEQ ID NO: 957)


CGB (SEQ ID NO: 55)


TTTCGAGTTTATTCGGT (SEQ ID NO: 958)


CGB (SEQ ID NO: 55)


TTTTGAGTTTATTTGGTT (SEQ ID NO: 959)


CGB (SEQ ID NO: 55)


TTATCGCGATGTGCGT (SEQ ID NO: 960)


CGB (SEQ ID NO: 55)


ATTATTGTGATGTGTGT (SEQ ID NO: 961)


BSG (SEQ ID NO: 56)


TACGGTTCGCGTTGTT (SEQ ID NO: 644)


BSG (SEQ ID NO: 56)


GGAGTATGGTTTGTGT (SEQ ID NO: 645)


BSG (SEQ ID NO: 56)


GTAAGGTTCGGCGAGA (SEQ ID NO: 646)


BSG (SEQ ID NO: 56)


GTAAGGTTTGGTGAGA (SEQ ID NO: 647)


BSG (SEQ ID NO: 56) 


TTACGTTTTCGGGAAG (SEQ ID NO: 648) 


BSG (SEQ ID NO: 56) 


TTATGTTTTTGGGAAGG (SEQ ID NO: 649) 


BSG (SEQ ID NO: 56) 


TACGTTTCGAGGATCGG (SEQ ID NO: 650) 


BSG (SEQ ID NO: 56) 


TATGTTTTGAGGATTGG (SEQ ID NO: 651) 


BCKDK (SEQ ID NO: 57) 


GGGCGTTAGGCGGATT (SEQ ID NO: 652)


BCKDK (SEQ ID NO: 57)


TGGGTGTTAGGTGGAT (SEQ ID NO: 653)


BCKDK (SEQ ID NO: 57)


AGAGCGGTTAGCGTAG (SEQ ID NO: 654)


BCKDK (SEQ ID NO: 57)


TGAGAGTGGTTAGTGT (SEQ ID NO: 655)


BCKDK (SEQ ID NO: 57)


ATAGAGGGCGTGAATT (SEQ ID NO: 656)


BCKDK (SEQ ID NO: 57)


AGAGGGTGTGAATTTT (SEQ ID NO: 657)


BCKDK (SEQ ID NO: 57)


TAGGATTTACGAGGAAA (SEQ ID NO: 658)


BCKDK (SEQ ID NO: 57)


AGGATTTATGAGGAAAAT (SEQ ID NO: 659)


SOX8 (SEQ ID NO: 58)


TTTTCGGTTCGAAGTA (SEQ ID NO: 660)


SOX8 (SEQ ID NO: 58)


TTTTGGTTTGAAGTAGG (SEQ ID NO: 661)


SOX8 (SEQ ID NO: 58)


AGGTCGTTTTTATCGA (SEQ ID NO: 662)


SOX8 (SEQ ID NO: 58)


AGGTTGTTTTTATTGAGT (SEQ ID NO: 663)


SOX8 (SEQ ID NO: 58)


GTAGTTACGGGGCGTT (SEQ ID NO: 664)


SOX8 (SEQ ID NO: 58)


GTAGTTATGGGGTGTT (SEQ ID NO: 665)


SOX8 (SEQ ID NO: 58)


TGTCGTATAGGCGGTT (SEQ ID NO: 666)


SOX8 (SEQ ID NO: 58)


TTGTTGTATAGGTGGTT (SEQ ID NO: 667)


DAG1 (SEQ ID NO: 59)


TTTCGTGGCGGAGAAT (SEQ ID NO: 820)


DAG1 (SEQ ID NO: 59)


TTTTGTGGTGGAGAAT (SEQ ID NO: 821)


DAG1 (SEQ ID NO: 59)


TACGGATATTTCGGTT (SEQ ID NO: 822)


DAG1 (SEQ ID NO: 59)


AATTATGGATATTTTGGTT (SEQ ID NO: 823)


DAG1 (SEQ ID NO: 59)


TTACGATTCGTAGGTT (SEQ ID NO: 824)


DAG1 (SEQ ID NO: 59)


TATTATTATGATTTGTAGGT (SEQ ID NO: 825)


SEMA4B (SEQ ID NO: 60)


AGTTTTGGGCGCGATTT (SEQ ID NO: 668)


SEMA4B (SEQ ID NO: 60)


AGTTTTGGGTGTGATTT (SEQ ID NO: 669)


SEMA4B (SEQ ID NO: 60)


AGCGAATAGATTGCGGAT (SEQ ID NO: 670)


SEMA4B (SEQ ID NO: 60)


AGTGAATAGATTGTGGAT (SEQ ID NO: 671)


SEMA4B (SEQ ID NO: 60)


AGCGATTAGATTGCGGAT (SEQ ID NO: 672)


SEMA4B (SEQ ID NO: 60)


AGTGATTAGATTGTGGAT (SEQ ID NO: 673)


SEMA4B (SEQ ID NO: 60)


TAGGCGTTCGATTTTT (SEQ ID NO: 674)


SEMA4B (SEQ ID NO: 60)


GGGTAGGTGTTTGATT (SEQ ID NO: 675)


APC (SEQ ID NO: 2)


GGTTTCGTTTAATCGT (SEQ ID NO: 928)


APC (SEQ ID NO: 2)


GGGTTTTGTTTAATTGTA (SEQ ID NO: 929)


APC (SEQ ID NO: 2)


TTCGTATTTAGCGGAT (SEQ ID NO: 930)


APC (SEQ ID NO: 2)


GGTTTGTATTTAGTGGA (SEQ ID NO: 931)


APC (SEQ ID NO: 2)


ATCGGCGGGTTTTCGA (SEQ ID NO: 932)


APC (SEQ ID NO: 2)


AATTGGTGGGTTTTTGA (SEQ ID NO: 933)


APC (SEQ ID NO: 2)


ATTTTCGAGTTCGGTA (SEQ ID NO: 934)


APC (SEQ ID NO: 2)


TTTTTGAGTTTGGTAGT (SEQ ID NO: 935)


CDKN2A (SEQ ID NO: 3)


GGCGTTGTTTAACGTAT (SEQ ID NO: 676)


CDKN2A (SEQ ID NO: 3)


GGGTGTTGTTTAATGTA (SEQ ID NO: 677)


CDKN2A (SEQ ID NO: 3)


AACGTATCGAATAGTTACGG (SEQ ID NO: 678)


CDKN2A (SEQ ID NO: 3)


AATGTATTGAATAGTTATGG (SEQ ID NO: 679)


CDKN2A (SEQ ID NO: 3)


TACGGTCGGAGGTCGA (SEQ ID NO: 680)


CDKN2A (SEQ ID NO: 3)


TATGGTTGGAGGTTGA (SEQ ID NO: 681)


CSPG2 (SEQ ID NO: 4)


TTCGGTTAGTTTCGTAT (SEQ ID NO: 904)


CSPG2 (SEQ ID NO: 4) 


TTTTGGTTAGTTTTGTATT (SEQ ID NO: 905) 


CSPG2 (SEQ ID NO: 4) 


TTCGGGTTATTACGTTT (SEQ ID NO: 906) 


CSPG2 (SEQ ID NO: 4) 


TTTTGGGTTATTATGTTTT (SEQ ID NO: 907) 


CSPG2 (SEQ ID NO: 4) 


TTTAGTCGCGTAGCGT (SEQ ID NO: 908) 


CSPG2 (SEQ ID NO: 4) 


ATTTAGTTGTGTAGTGTT (SEQ ID NO: 909)


CSPG2 (SEQ ID NO: 4)


AATTCGCGAGTTTAGA (SEQ ID NO: 910)


CSPG2 (SEQ ID NO: 4)


GAAAAAAATTTGTGAGTT (SEQ ID NO: 911)


ERBB2 (SEQ ID NO: 5)


TGTGAGAACGGTTGTA (SEQ ID NO: 912)


ERBB2 (SEQ ID NO: 5)


TGAGAATGGTTGTAGG (SEQ ID NO: 913)


ERBB2 (SEQ ID NO: 5)


TTAGGCGTTTCGGCGT (SEQ ID NO: 914)


ERBB2 (SEQ ID NO: 5)


TTTAGGTGTTTTGGTGT (SEQ ID NO: 915)


ERBB2 (SEQ ID NO: 5)


TAGGTTTGCGCGAAGA (SEQ ID NO: 916)


ERBB2 (SEQ ID NO: 5)


TTTGTGTGAAGAGAGG (SEQ ID NO: 917)


ERBB2 (SEQ ID NO: 5)


TAATTATCGGAGAAGGA (SEQ ID NO: 918)


ERBB2 (SEQ ID NO: 5)


TAATTATTGGAGAAGGAG (SEQ ID NO: 919)


STMN1 (SEQ ID NO: 6)


TTAGGCGGTTCGGATT (SEQ ID NO: 1012)


STMN1 (SEQ ID NO: 6)


TTAGGTGGTTTGGATT (SEQ ID NO: 1013)


STMN1 (SEQ ID NO: 6)


TATCGGTTCGGGAATT (SEQ ID NO: 1014)


STMN1 (SEQ ID NO: 6)


TATTGGTTTGGGAATTT (SEQ ID NO: 1015)


STMN1 (SEQ ID NO: 6)


TTTCGCGCGGAGGTTA (SEQ ID NO: 1016)


STMN1 (SEQ ID NO: 6)


TTTTGTGTGGAGGTTA (SEQ ID NO: 1017)


STMN1 (SEQ ID NO: 6)


GGTAAGAACGTATATAGT (SEQ ID NO: 1018)


STMN1 (SEQ ID NO: 6)


TGGTAAGAATGTATATAGT (SEQ ID NO: 1019)


STMN1 (SEQ ID NO: 6)


TTTCGGTTAATGCGGA (SEQ ID NO: 1020)


STMN1 (SEQ ID NO: 6)


TTTTTGGTTAATGTGGA (SEQ ID NO: 1021)


STMN1 (SEQ ID NO: 6)


TACGTTCGCGATTTGT (SEQ ID NO: 1022)


STMN1 (SEQ ID NO: 6)


AGGGTTATGTTTGTGA (SEQ ID NO: 1023)


STMN1 (SEQ ID NO: 6)


GATACGTCGGTGTCGG (SEQ ID NO: 1024)


STMN1 (SEQ ID NO: 6)


TGATATGTTGGTGTTGG (SEQ ID NO: 1025)


STMN1 (SEQ ID NO: 6)


TTACGGCGAGATTATT (SEQ ID NO: 1026)


STMN1 (SEQ ID NO: 6)


TTTTATGGTGAGATTATTT (SEQ ID NO: 1027)


STK11 (SEQ ID NO: 7)


ATTAATCGTCGTTCGG (SEQ ID NO: 880)


STK11 (SEQ ID NO: 7)


GATTAATTGTTGTTTGGG (SEQ ID NO: 881)


STK11 (SEQ ID NO: 7)


TAATCGTTAGCGGCGG (SEQ ID NO: 882)


STK11 (SEQ ID NO: 7)


TTAATTGTTAGTGGTGG (SEQ ID NO: 883)


STK11 (SEQ ID NO: 7)


GTCGTTTTTGCGAGGA (SEQ ID NO: 884)


STK11 (SEQ ID NO: 7)


GTTGTTTTTGTGAGGAG (SEQ ID NO: 885)


STK11 (SEQ ID NO: 7)


TAATGAGCGCGTTGTA (SEQ ID NO: 886)


STK11 (SEQ ID NO: 7)


ATGAGTGTGTTGTATTT (SEQ ID NO: 887)


CA9 (SEQ ID NO: 8)


ATGGTTTCGATAATTTTT (SEQ ID NO: 682)


CA9 (SEQ ID NO: 8)


ATGGTTTTGATAATTTTTT (SEQ ID NO: 683)


CA9 (SEQ ID NO: 8)


TGTACGTATAGTTCGTA (SEQ ID NO: 684)


CA9 (SEQ ID NO: 8)


TTAATGTATGTATAGTTTGT (SEQ ID NO: 685)


CA9 (SEQ ID NO: 8)


ATATATCGTGTGTTGGG (SEQ ID NO: 686)


CA9 (SEQ ID NO: 8)


ATATATTGTGTGTTGGG (SEQ ID NO: 687)


CA9 (SEQ ID NO: 8)


ATAGTTAGTCGTATGGT (SEQ ID NO: 688)


CA9 (SEQ ID NO: 8)


ATAGTTAGTTGTATGGTT (SEQ ID NO: 689)


PAX6 (SEQ ID NO: 9)


TATTGTTTCGGTTGTTAG (SEQ ID NO: 690)


PAX6 (SEQ ID NO: 9)


TATTGTTTTGGTTGTTAG (SEQ ID NO: 691)


PAX6 (SEQ ID NO: 9) 


GGCGACGCGGTTAGTT (SEQ ID NO: 692) 


PAX6 (SEQ ID NO: 9) 


GGTGATGTGGTTAGTT (SEQ ID NO: 693) 


PAX6 (SEQ ID NO: 9) 


TAGGTCGCGTAGATTT (SEQ ID NO: 694) 


PAX6 (SEQ ID NO: 9) 


AGTTTAGGTTGTGTAGA (SEQ ID NO: 695) 


PAX6 (SEQ ID NO: 9) 


TAGCGTATTTTTCGGT (SEQ ID NO: 696)


PAX6 (SEQ ID NO: 9) 


TAGTGTATTTTTTGGTTG (SEQ ID NO: 697) 


SFN (SEQ ID NO: 10)


AGTAGGTCGAACGTTA (SEQ ID NO: 698)


SFN (SEQ ID NO: 10)


AGAGTAGGTTGAATGTT (SEQ ID NO: 699)


SFN (SEQ ID NO: 10)


TTGCGAAGAGCGAAAT (SEQ ID NO: 700)


SFN (SEQ ID NO: 10)


TGTGAAGAGTGAAATTT (SEQ ID NO: 701)


SFN (SEQ ID NO: 10)


TTCGAGGTGCGTGAGT (SEQ ID NO: 702)


SFN (SEQ ID NO: 10)


TTTGAGGTGTGTGAGTA (SEQ ID NO: 703)


SFN (SEQ ID NO: 10)


TGTGCGATATCGTGTT (SEQ ID NO: 704)


SFN (SEQ ID NO: 10)


TGTGATATTGTGTTGGG (SEQ ID NO: 705)


S100A2 (SEQ ID NO: 11)


TTTAATTGCGGTTGTGTG (SEQ ID NO: 786)


S100A2 (SEQ ID NO: 11)


TTTAATTGTGGTTGTGTG (SEQ ID NO: 787)


S100A2 (SEQ ID NO: 11)


TATATAGGCGTATGTATG (SEQ ID NO: 788)


S100A2 (SEQ ID NO: 11)


TATATAGGTGTATGTATG (SEQ ID NO: 789)


S100A2 (SEQ ID NO: 11)


TGTATACGAGTATTGGA (SEQ ID NO: 790)


S100A2 (SEQ ID NO: 11)


TATGTATATGAGTATTGGA (SEQ ID NO: 791)


S100A2 (SEQ ID NO: 11)


AGTTTTAGCGTGTGTTTA (SEQ ID NO: 792)


S100A2 (SEQ ID NO: 11)


AGTTTTAGTGTGTGTTTA (SEQ ID NO: 793)


TFF1 (SEQ ID NO: 12)


AGAATTTATCGTATAAAAAG (SEQ ID NO: 794)


TFF1 (SEQ ID NO: 12)


AATTTATTGTATAAAAAGGT (SEQ ID NO: 795)


TFF1 (SEQ ID NO: 12)


GGACGTCGATGGTATT (SEQ ID NO: 796)


TFF1 (SEQ ID NO: 12)


AGGGATGTTGATGGTA (SEQ ID NO: 797)


TFF1 (SEQ ID NO: 12)


AACGGTGTCGTCGAAA (SEQ ID NO: 798)


TFF1 (SEQ ID NO: 12)


AATGGTGTTGTTGAAAT (SEQ ID NO: 799)


TGFBR2 (SEQ ID NO: 13)


AAAACGTGGACGTTTT (SEQ ID NO: 896)


TGFBR2 (SEQ ID NO: 13)


GAAAATGTGGATGTTTT (SEQ ID NO: 897)


TGFBR2 (SEQ ID NO: 13)


TGAAAGTCGGTTAAAGT (SEQ ID NO: 898)


TGFBR2 (SEQ ID NO: 13)


TGAAAGTTGGTTAAAGT (SEQ ID NO: 899)


TGFBR2 (SEQ ID NO: 13)


TTGGACGTCGAGGAGA (SEQ ID NO: 900)


TGFBR2 (SEQ ID NO: 13)


TTGGATGTTGAGGAGA (SEQ ID NO: 901)


TGFBR2 (SEQ ID NO: 13)


TTTTCGGGCGGAGAGA (SEQ ID NO: 902)


TGFBR2 (SEQ ID NO: 13)


AAGGTTTTTGGGTGGA (SEQ ID NO: 903)


TP53 (SEQ ID NO: 14)


TATTAGGTCGGCGAGA (SEQ ID NO: 858)


TP53 (SEQ ID NO: 14)


AGGTTGGTGAGAATTT (SEQ ID NO: 859)


TP53 (SEQ ID NO: 14)


TTCGGTAGGCGGATTA (SEQ ID NO: 860)


TP53 (SEQ ID NO: 14)


TTTTTGGTAGGTGGAT (SEQ ID NO: 861)


TP53 (SEQ ID NO: 14)


ATATTTTGCGTTCGGG (SEQ ID NO: 862)


TP53 (SEQ ID NO: 14)


ATATTTTGTGTTTGGGT (SEQ ID NO: 863)


TP53 (SEQ ID NO: 14)


TACGACGGTGATACGT (SEQ ID NO: 864)


TP53 (SEQ ID NO: 14)


TTTATGATGGTGATATGT (SEQ ID NO: 865)


TP73 (SEQ ID NO: 15)


TTCGTTCGCGAAGTTA (SEQ ID NO: 706)


TP73 (SEQ ID NO: 15)


GGTTTGTTTGTGAAGTTA (SEQ ID NO: 707)


PLAU (SEQ ID NO: 16)


AAGAGGTCGTCGGGAT (SEQ ID NO: 708)


PLAU (SEQ ID NO: 16)


AAGAGGTTGTTGGGAT (SEQ ID NO: 709)


PLAU (SEQ ID NO: 16)


TTATCGCGGGTATTTT (SEQ ID NO: 710)


PLAU (SEQ ID NO: 16)


TTGGTTATTGTGGGTAT (SEQ ID NO: 711)


PLAU (SEQ ID NO: 16)


TTCGATTTCGTTATTATG (SEQ ID NO: 712)


PLAU (SEQ ID NO: 16)


TTTGATTTTGTTATTATGAG (SEQ ID NO: 713)


PLAU (SEQ ID NO: 16)


GTCGTGAGCGATTTTA (SEQ ID NO: 714)


PLAU (SEQ ID NO: 16)


TTGGTTGTGAGTGATT (SEQ ID NO: 715)


TMEFF2 (SEQ ID NO: 17)


TATCGTAGTTCGTTCGG (SEQ ID NO: 874)


TMEFF2 (SEQ ID NO: 17)


ATTGTAGTTTGTTTGGT (SEQ ID NO: 875)


TMEFF2 (SEQ ID NO: 17)


AAACGTTTATCGGTTG (SEQ ID NO: 876)


TMEFF2 (SEQ ID NO: 17)


AATGTTTATTGGTTGGA (SEQ ID NO: 877)


TMEFF2 (SEQ ID NO: 17)


TTCGTAGAAGAATACGCGTA (SEQ ID NO: 878)


TMEFF2 (SEQ ID NO: 17)


TTTGTAGAAGAATATGTGTA (SEQ ID NO: 879)


ESR1 (SEQ ID NO: 18)


TGCGGTTGTATACGTAG (SEQ ID NO: 962)


ESR1 (SEQ ID NO: 18)


TGTGTGGTTGTATATGT (SEQ ID NO: 963)


ESR1 (SEQ ID NO: 18)


TTCGTGTTAGATTTCGATAT (SEQ ID NO: 964)


ESR1 (SEQ ID NO: 18)


TTTGTGTTAGATTTTGATAT (SEQ ID NO: 965)


ESR1 (SEQ ID NO: 18)


AACGCGAAAGACGGAT (SEQ ID NO: 966)


ESR1 (SEQ ID NO: 18)


ATAAATGTGAAAGATGGA (SEQ ID NO: 967)


ESR1 (SEQ ID NO: 18)


GGGCGTACGAGGATTT (SEQ ID NO: 968)


ESR1 (SEQ ID NO: 18)


GGGTGTATGAGGATTT (SEQ ID NO: 969)


HSPB1 (SEQ ID NO: 20) 


AGGGTATTCGTCGGTT (SEQ ID NO: 888)


HSPB1 (SEQ ID NO: 20)


AGGGTATTTGTTGGTT (SEQ ID NO: 889)


HSPB1 (SEQ ID NO: 20) 


GAATTCGAGAGCGCGA (SEQ ID NO: 892) 


HSPB1 (SEQ ID NO: 20) 


TGAATTTGAGAGTGTGA (SEQ ID NO: 893) 


RASSF1 (SEQ ID NO: 21) 


AGTAAATCGGATTAGGA (SEQ ID NO: 852) 


RASSF1 (SEQ ID NO: 21) 


AGTAAATTGGATTAGGAG (SEQ ID NO: 853) 


RASSF1 (SEQ ID NO: 21)


TACGGGTATTTTCGCGT (SEQ ID NO: 854)


RASSF1 (SEQ ID NO: 21)


ATATGGGTATTTTTGTGT (SEQ ID NO: 855)


RASSF1 (SEQ ID NO: 21)


TGCGAGAGCGCGTTTA (SEQ ID NO: 856)


RASSF1 (SEQ ID NO: 21)


TTGTGAGAGTGTGTTTA (SEQ ID NO: 857)


GRIN2D (SEQ ID NO: 24) 


ATTTCGATTTGGAGGCGG (SEQ ID NO: 716)


GRIN2D (SEQ ID NO: 24)


ATTTTGATTTGGAGGTGG (SEQ ID NO: 717)


PSAT1 (SEQ ID NO: 25)


TTCGTCGGTGTTACGT (SEQ ID NO: 718)


PSAT1 (SEQ ID NO: 25)


TTTTGTTGGTGTTATGT (SEQ ID NO: 719)


PSAT1 (SEQ ID NO: 25)


GGCGAGTTCGGGTAGT (SEQ ID NO: 720)


PSAT1 (SEQ ID NO: 25)


GGTGAGTTTGGGTAGT (SEQ ID NO: 721) 


PSAT1 (SEQ ID NO: 25) 


ATAGTAAACGCGAGGA (SEQ ID NO: 818) 


PSAT1 (SEQ ID NO: 25) 


AGTAAATGTGAGGAGG (SEQ ID NO: 819) 


PSAT1 (SEQ ID NO: 25) 


AAGTTTTCGCGAGCGG (SEQ ID NO: 722) 


PSAT1 (SEQ ID NO: 25) 


AAGTTTTTGTGAGTGG (SEQ ID NO: 723) 


PSAT1 (SEQ ID NO: 25) 


AGGAAGTTCGGCGAGG (SEQ ID NO: 724) 


PSAT1 (SEQ ID NO: 25)


AGGAAGTTTGGTGAGG (SEQ ID NO: 725) 


CYP2D6 (SEQ ID NO: 27) 


TACGACGATTTTCGTT (SEQ ID NO: 726) 


CYP2D6 (SEQ ID NO: 27) 


GAGTATGATGATTTTTGT (SEQ ID NO: 727) 


CYP2D6 (SEQ ID NO: 27) 


TTCGTCGATTAAGTCGG (SEQ ID NO: 728)


CYP2D6 (SEQ ID NO: 27)


TTTGTTGATTAAGTTGGT (SEQ ID NO: 729)


CYP2D6 (SEQ ID NO: 27)


GTGGCGCGAGTAGAGG (SEQ ID NO: 730)


CYP2D6 (SEQ ID NO: 27)


GTGGTGTGAGTAGAGG (SEQ ID NO: 731)


CYP2D6 (SEQ ID NO: 27)


AACGTTTACGTGTTCGT (SEQ ID NO: 732)


CYP2D6 (SEQ ID NO: 27)


GTAATGTTTATGTGTTTGT (SEQ ID NO: 733)


COX7A2L (SEQ ID NO: 28) 


AATTCGATCGCGGGTA (SEQ ID NO: 1086) 


COX7A2L (SEQ ID NO: 28) 


ATTTGATTGTGGGTAGA (SEQ ID NO: 1087) 


PLAU (SEQ ID NO: 30)


TATTTGTCGCGTTGAT (SEQ ID NO: 1044) 


PLAU (SEQ ID NO: 30)


ATTTGTTGTGTTGATGA (SEQ ID NO: 1045) 


PLAU (SEQ ID NO: 30) 


TGTAATTCGGGGATTT (SEQ ID NO: 1046)


PLAU (SEQ ID NO: 30)


TTGTAATTTGGGGATTT (SEQ ID NO: 1047)


PLAU (SEQ ID NO: 30)


AGGAAGTACGGAGAAT (SEQ ID NO: 1048)


PLAU (SEQ ID NO: 30)


AGGAAGTATGGAGAATT (SEQ ID NO: 1049)


PLAU (SEQ ID NO: 30)


TTCGTTGGAGATCGCGT (SEQ ID NO: 1050)


PLAU (SEQ ID NO: 30)


TTTGTTGGAGATTGTGT (SEQ ID NO: 1051)


PLAU (SEQ ID NO: 30)


TTGCGGAAGTACGCGG (SEQ ID NO: 1052)


PLAU (SEQ ID NO: 30)


TTGTGGAAGTATGTGG (SEQ ID NO: 1053)


VTN (SEQ ID NO: 31)


1 ICijULrl 1 UCaUCjrAAAijr (SEQ ID NO: 1028)


VTN (SEQ ID NO: 31)


1 1 KjajvjI 1 luluAAAu (SEQ ID NO: 1029)


VTN (SEQ ID NO: 31)


1 1 1 Ivjrl IUCjUCjI ICjAA (SEQ ID NO: 1030)


VTN (SEQ ID NO: 31)


1 ICjI 1 ICjICjI 1 CjAACjI A (SEQ ID NO: 1031)


VTN (SEQ ID NO: 31)


ICjCjCjICCjCCjACjCjI ACjI (SEQ ID NO: 1032)


VTN (SEQ ID NO: 31)


l(jrU(jrl l(jr 1 Cj A(jr(jr 1 AU 1 (SEQ ID NO: 1033)


VTN (SEQ ID NO: 31)


1 IUCjAICjCjUCjCjI 1 IUCjtA (SEQ ID NO: 1036)


VTN (SEQ ID NO: 31)


111 CjtA 1 LrLr 1 (jtCj 1 111 UA (SEQ ID NO: 1037)


SULT1A1 (SEQ ID NO: 32)


1 lULrAijrlUCjl 1 1 luAl (SEQ ID NO: 734)


SULT1A1 (SEQ ID NO: 32)


1 1 1 LjACj 1 1 Vjr 1 1 1 1 ijrAl (SEQ ID NO: 735)


SULT1A1 (SEQ ID NO: 32)


1 1 ULr 1 k^kj 1 vjr 1 AULrU 1 1 (SEQ ID NO: 736)


SULT1A1 (SEQ ID NO: 32)


1 1 1 Lrl 1 ul ul Al vjLrl 1 I (SEQ ID NO: 737)


SULT1A1 (SEQ ID NO: 32)


AUCjAI 1 ILul I 1 lUUrVjr (SEQ ID NO: 738)


SULT1A1 (SEQ ID NO: 32)


A(jt(jtA1 1 1 l(jrl 1 1 1 ILrULr (SEQ ID NO: 739)


SULT1A1 (SEQ ID NO: 32)


111 1CCj(jt1 1 (jAACjt 1 UCjCj (SEQ ID NO: 740)


SULT1A1 (SEQ ID NO: 32)


1 1 1 1 ICjCjI ICjAACjI ICjCj (SEQ ID NO: 741)


PCAF (SEQ ID NO: 33) 


A(jrC(jrlC(jr(jrl AUCri Al A (SEQ ID NO: 986)


PCAF (SEQ ID NO: 33)


LKjIAvjICjI 1 1 A I (jr 1 (SEQ ID NO: 987)


PRKCD (SEQ ID NO: 34) 


All 1C(jtU(jt1 ILAjLtAI 1 (SEQ ID NO: 742)


PRKCD (SEQ ID NO: 34)


UA1 1 1 lUl(jrl 1 ICAjAI 1 (SEQ ID NO: 743)


EGR4 (SEQ ID NO: 1)


AALtLAjIAI 1 lAlULrLrA (SEQ ID NO: 744)


EGR4 (SEQ ID NO: 1)


Lru AALr 1 vjr 1 A 1 1 1A1 ICjijrA (SEQ ID NO: 745)


EGR4 (SEQ ID NO: 1)


1 AlUCjrCjAL,(jr(jrlCCj(jrl 1 (SEQ ID NO: 746)


EGR4 (SEQ ID NO: 1)


Al 1 1A1 1 U<aA 1 ijrijr 1 l(jr(jr (SEQ ID NO: 747)


EGR4 (SEQ ID NO: 1)


AUIAAj 1 ALrULr 1 1 1 1 ALr (SEQ ID NO: 748)


EGR4 (SEQ ID NO: 1)


1 LtAUCj 1 Lr 1 ACj 1 kj 1 111 (SEQ ID NO: 749)


EGR4 (SEQ ID NO: 1)


AALul 1 AlAVjrl lUUAvjrl (SEQ ID NO: 750)


EGR4 (SEQ ID NO: 1)


AA Ivjt 1 1 A 1 A<jr 111 LjALt 111 (SEQ ID NO: 751)


TP73 (SEQ ID NO: 15)


KJ 1 IjrUvjAljr 1 1 Aijr 1 C^jijrA (SEQ ID NO: 752)


TP73 (SEQ ID NO: 15)


LrluluAul lAul 1 uuA (SEQ ID NO: 753)


TP73 (SEQ ID NO: 15)


IAIUCjCjtI lUvjr(jrA(jl 1A (SEQ ID NO: 754)


TP73 (SEQ ID NO: 15)


AULrAlAl KjtLjI 1 l(jr(jrA(jr (SEQ ID NO: 755)


TP73 (SEQ ID NO: 15)


ACjACjICLtI ICCjCjtAAI 1 (SEQ ID NO: 756)


TP73 (SEQ ID NO: 15)


l(jr ACj ACj 1 1 Cjr 1 1 1 CjCjAA 1 (SEQ ID NO: 757)


SYK (SEQ ID NO: 19)


CjrAACjrl lAlCUCCjl ICjCj (SEQ ID NO: 826)


SYK (SEQ ID NO: 19)


AUAAijrl 1A1 ICjIUI Kjtvjt (SEQ ID NO: 827)


SYK (SEQ ID NO: 19)




SYK (SEQ ID NO: 19)


GGGATTGATGTGGTTTA (SEQ ID NO: 829)


SYK (SEQ ID NO: 19)


GTTCGGCGGGAGGAGA (SEQ ID NO: 830)


SYK (SEQ ID NO: 19)


GTTTGGTGGGAGGAGA (SEQ ID NO: 831)


SYK (SEQ ID NO: 19)


AGTCGATTTTCGTTTAG (SEQ ID NO: 832)


SYK (SEQ ID NO: 19)


TAGTTGATTTTTGTTTAGT (SEQ ID NO: 833)


SYK (SEQ ID NO: 19)


oo a a n a nTrnmrirTTT (SEQ ID NO: 834)


SYK (SEQ ID NO: 19)


^jLtAACjALt 1 1 Cj 1 iaLrLr 1 1 (SEQ ID NO: 835)


HSPB1 (SEQ ID NO: 20)




HSPB1 (SEQ ID NO: 20)


ALrl ILrlCjl 1 A 1 CjUt 1 AajIot (SEQ ID NO: 893)


HSPB1 (SEQ ID NO: 20)


11111 ii^VJl lAAuuAAAu (SEQ ID NO: 894)


HSPB1 (SEQ ID NO: 20)


llllllllLrl 1 A ALtVjAAAvjt (SEQ ID NO: 895)


TES (SEQ ID NO: 22)


1 AuAAu i L,LjVjt 1 1 L^Lj 1 kj (SEQ ID NO: 758)


TES (SEQ ID NO: 22)


/\.OrAAijrl IvjLtI 1 lljrltjAj; (SEQ ID NO: 759)


TES (SEQ ID NO: 22)


OAxnrr^r^oorir^or^o a a n (SEQ ID NO: 760)


TES (SEQ ID NO: 22)


J\L L KJKJKJ 1 KJKJ 1 UUAAU 1 (SEQ ID NO: 761)


TES (SEQ ID NO: 22)


1 AvjL^ijrvjrAVJ 1 L^ijrvjrAUrijr 1 (SEQ ID NO: 762)


TES (SEQ ID NO: 22)


1 AVj 1 ijrijrAljr 1 lUrijrAijrUr 1 (SEQ ID NO: 763)


TES (SEQ ID NO: 22)


AA 1 1 L^LjCj 1 k^Kj 1 LrLrLrA 1 (SEQ ID NO: 764)


TES (SEQ ID NO: 22)


AA1 1 1 \jKj L ILtICjLtvjAI (SEQ ID NO: 765)


PITX2 (SEQ ID NO: 23)


aottooooa a a nr*n a a a (SEQ ID NO: 970)


PITX2 (SEQ ID NO: 23)


A OTT"000 A n A OTO A A A (SEQ ID NO: 971)


PITX2 (SEQ ID NO: 23)


A A O A OTOOOO A OXOr^J-O A (SEQ ID NO: 972)


PITX2 (SEQ ID NO: 23)


A AOAOTTOOOAOTTOOA (SEQ ID NO: 973)


PITX2 (SEQ ID NO: 23)


OOTOOA AOAOTOOOOA (SEQ ID NO: 974)


PITX2 (SEQ ID NO: 23)


OOTTOA AOAOTTOOOA (SEQ ID NO: 975)


PITX2 (SEQ ID NO: 23)


ATOTT 1 A OOOOOTOO A A (SEQ ID NO: 976)


PITX2 (SEQ ID NO: 23)


NT A OT'OOO'TTT^ A ACl ACVT (SEQ ID NO: 977)


GRIN2D (SEQ ID NO: 24)


O A CX A CVTC^nCXCX A TO A TT (SEQ ID NO: 766)


GRIN2D (SEQ ID NO: 24)


p T n AP T A OTTOOO A TO A T (SEQ ID NO: 767)


GRIN2D (SEQ ID NO: 24)


T A OOOTOO A O A TTTOO (SEQ ID NO: 768)


GRIN2D (SEQ ID NO: 24)


TT A OOOTTO A O A TTTOO (SEQ ID NO: 769)


GRIN2D (SEQ ID NO: 24)


AOTOTOOOOA ATATTO (SEQ ID NO: 770)


GRIN2D (SEQ ID NO: 24)


OTOTOOTOA ATATTPtA A (SEQ ID NO: 771)


PSAT1 (SEQ ID NO: 25)


TTTOO A TTOOOTTT A CX A (SEQ ID NO: 808)


PSAT1 (SEQ ID NO: 25)


A A TTPtTTTTPt A TTTOOTT (SEQ ID NO: 809)


PSAT1 (SEQ ID NO: 25)


TA ATOOOOOOTOO A TT (SEQ ID NO: 810)


PSAT1 (SEQ ID NO: 25)


TTA A TOOOOTOTTO A TT (SEQ ID NO: 811)


PSAT1 (SEQ ID NO: 25)


T A TOOT A OOOOTT A OO (SEQ ID NO: 812)


PSAT1 (SEQ ID NO: 25)


T A TTOT A OTOOTT A OO A A (SEQ ID NO: 813)


PSAT1 (SEQ ID NO: 25)


AOOA A OOTT A OTOOTT (SEQ ID NO: 814)


PSAT1 (SEQ ID NO: 25)


TAOOA A TOTT A OTTOTTT (SEQ ID NO: 815)


PSAT1 (SEQ ID NO: 25)


OOTOOTOOT ATT A TOO A (SEQ ID NO: 816)


PSAT1 (SEQ ID NO: 25)


TOOTTOTTOT A TT A TOO A (SEQ ID NO: 817)


CGA (SEQ ID NO: 26)


A T A TTT A TTTTOOO A A A TTT (SEQ ID NO: 836)


CGA (SEQ ID NO: 26)


TTATTTTTOOA A A TTT AT A OT (SEQ ID NO: 837)


CGA (SEQ ID NO: 26)


TO A TTTTOTOOTT A TT A TT (SEQ ID NO: 838)


CGA (SEQ ID NO: 26)


TTO A TTTTOTTOTT A TT A TT (SEQ ID NO: 839)


CGA (SEQ ID NO: 26)


TA A A TTO A OOTT A TOOT A (SEQ ID NO: 840)


CGA (SEQ ID NO: 26)


AAATTGATGTTATGGTAA A (SEQ ID NO: 841)


CGA (SEQ ID NO: 26) 


AATTGACGTTATGGTAAT (SEQ ID NO: 842) 


CGA (SEQ ID NO: 26) 


TAAAAATTGATGTTATGGT (SEQ ID NO: 843) 


COX7A2L (SEQ ID NO: 28) 


TTGTTCGAAGATCGTT (SEQ ID NO: 1078) 


COX7A2L (SEQ ID NO: 28) 


GTTGTTTGAAGATTGTTT (SEQ ID NO: 1079) 


COX7A2L (SEQ ID NO: 28) 


TAGCGTAAGGATTCGGT (SEQ ID NO: 1080) 


COX7A2L (SEQ ID NO: 28)


rTArrTGTAAGGATTTGGT (SEQ ID NO: 1081)


COX7A2L (SEQ ID NO: 28)


AGAGTTCGGTTTTTCGTA (SEQ ID NO: 1082)


COX7A2L (SEQ ID NO: 28)


AGAGTTTGGTTTTTTGTA (SEQ ID NO: 1083)


COX7A2L (SEQ ID NO: 28)


ATTCGTATTTGCGGGTTA (SEQ ID NO: 1084)


COX7A2L (SEQ ID NO: 28)


ATTTGTATTTGTGGGTTA (SEQ ID NO: 1085)


ESR2 (SEQ ID NO: 29)


(SEQ ID NO: 936)


ESR2 (SEQ ID NO: 29)


ATTTTGAGGATTATGTTTT (SEQ ID NO: 937)


ESR2 (SEQ ID NO: 29)


AGATGGCGTTTTTCGTA (SEQ ID NO: 938)


ESR2 (SEQ ID NO: 29)


TAGATGGTGTTTTTTGTA (SEQ ID NO: 939)


ESR2 (SEQ ID NO: 29)


ATTTTCGAATCGATTTTT (SEQ ID NO: 940)


ESR2 (SEQ ID NO: 29)


GGAGTATTTTTGAATTGAT (SEQ ID NO: 941)


ESR2 (SEQ ID NO: 29)


AGTTCGACGGTTTTAG (SEQ ID NO: 942)


ESR2 (SEQ ID NO: 29)


AGGGAGTTTGATGGTT (SEQ ID NO: 943)


ESR2 (SEQ ID NO: 29)


AGTTTACGTGATCGAG (SEQ ID NO: 944)


ESR2 (SEQ ID NO: 29)


AGTTTATGTGATTGAGTT (SEQ ID NO: 945)


VTN (SEQ ID NO: 31)


GGTGGTATCGATTGAT (SEQ ID NO: 1034)


VTN (SEQ ID NO: 31)


TGGTGGTATTGATTGAT (SEQ ID NO: 1035)


VTN (SEQ ID NO: 31)


TAGTGATTCGCGGGGA (SEQ ID NO: 1038)


VTN (SEQ ID NO: 31)


TAGTGATTTGTGGGGA (SEQ ID NO: 1039)


VTN (SEQ ID NO: 31)


TTATGTCGGAGGATGA (SEQ ID NO: 1040)


VTN (SEQ ID NO: 31)


ATTATGTTGGAGGATGA (SEQ ID NO: 1041)


VTN (SEQ ID NO: 31)


ATACGGTTTATGACGAT (SEQ ID NO: 1042)


VTN (SEQ ID NO: 31)


ATATGGTTTATGATGATGG (SEQ ID NO: 1043)


PCAF (SEQ ID NO: 33)


GAGCGGTAGGTGTCGAA (SEQ ID NO: 978)


PCAF (SEQ ID NO: 33)


GAGTGGTAGGTGTTGAA (SEQ ID NO: 979)


PCAF (SEQ ID NO: 33)


TAAGATTTCGCGGGTA (SEQ ID NO: 980)


PCAF (SEQ ID NO: 33)


TGTAAGATTTTGTGGGTA (SEQ ID NO: 981)


PCAF (SEQ ID NO: 33)


AGTTCGTAGTTTCGAG (SEQ ID NO: 982)


PCAF (SEQ ID NO: 33)


GTTTGTAGTTTTGAGGA (SEQ ID NO: 983)


PCAF (SEQ ID NO: 33)


TAGGGCGCGGAGTAGA (SEQ ID NO: 984)


PCAF (SEQ ID NO: 33)


TAGGGTGTGGAGTAGA (SEQ ID NO: 985)


PRKCD (SEQ ID NO: 34)


ATTTATTTTTCGTTGTAGG (SEQ ID NO: 772)


PRKCD (SEQ ID NO: 34)


TATTTATTTTTTGTTGTAGG (SEQ ID NO: 773)


PRKCD (SEQ ID NO: 34)


TTTCGGAAACGGGAAT (SEQ ID NO: 774)


PRKCD (SEQ ID NO: 34)


TAGTTTTGGAAATGGGA (SEQ ID NO: 775)


PRKCD (SEQ ID NO: 34)


GGACGGAGTTATCGGT (SEQ ID NO: 776)


PRKCD (SEQ ID NO: 34)


GGATGGAGTTATTGGTA (SEQ ID NO: 777)


PRKCD (SEQ ID NO: 34)


GTTTAGCGGAGGGATA (SEQ ID NO: 778)


PRKCD (SEQ ID NO: 34)


TGTTTAGTGGAGGGAT (SEQ ID NO: 779)


ESR1 (exon8) (SEQ ID NO: 61)


(SEQ ID NO: 780)


ESR1 (exon8) (SEQ ID NO: 61) 


TTGTTATGGTTTGAGAGT (SEQ ID NO: 781) 


ESR1 (exon8) (SEQ ID NO: 61) 


TTTGTTATAGTTTGAGAGT (SEQ ID NO: 782) 


ESR1 (exon8) (SEQ ID NO: 61) 


TTTGTTACGGTTTGAG (SEQ ID NO: 783) 


ESR1 (exon8) (SEQ ID NO: 61) 


TTTGTTATGGTTTGAGA (SEQ ID NO: 784) 


ESR1 (exon8) (SEQ ID NO: 61) 


TTTGTTATAGTTTGAGAG (SEQ ID NO: 785) 
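
Several of the legible oligonucleotide pairs listed above differ only in that each C of a CG dinucleotide in the first oligonucleotide appears as T in the second. The following minimal Python sketch illustrates that relationship for four such pairs taken from the table. It is an illustration only: the function name is hypothetical, and the assumption that the second member of a pair is obtained purely by CG to TG substitution does not hold for every pair, since many TG oligonucleotides are shifted or extended by one or two bases.

```python
# Illustrative sketch (not part of the original disclosure).
# Assumption: the TG oligonucleotide of a pair equals the CG oligonucleotide
# with every "CG" dinucleotide replaced by "TG". This holds for the four
# pairs below but not for pairs whose TG member is shifted or extended.

def tg_counterpart(cg_oligo: str) -> str:
    """Return the sequence with each CG dinucleotide replaced by TG."""
    return cg_oligo.replace("CG", "TG")

# Pairs copied from the table above: CG oligo -> listed TG oligo.
listed_pairs = {
    "AAGTTTTCGCGAGCGG": "AAGTTTTTGTGAGTGG",  # SEQ ID NO: 722 / 723 (PSAT1)
    "AGGAAGTTCGGCGAGG": "AGGAAGTTTGGTGAGG",  # SEQ ID NO: 724 / 725 (PSAT1)
    "GTGGCGCGAGTAGAGG": "GTGGTGTGAGTAGAGG",  # SEQ ID NO: 730 / 731 (CYP2D6)
    "GTTCGGCGGGAGGAGA": "GTTTGGTGGGAGGAGA",  # SEQ ID NO: 830 / 831 (SYK)
}

for cg, tg in listed_pairs.items():
    print(cg, "->", tg_counterpart(cg), "matches listed:", tg_counterpart(cg) == tg)
```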



Table 4: Numbers of censored and relapsed patients in a randomly selected sample set of the ER+, 
N0, untreated population. 





                        Frequency   Percentage
Censored                      276         66.5
Distant metastasis             66         15.9
Locoregional relapse           49         11.8
Contralateral breast           24          5.8
Sum                           415        100.0



Table 5: Numbers of censored and relapsed patients in the ER+, N0, TAM-treated population. 





                        Frequency   Percentage
Censored                      485         89.6
Distant metastasis             31          5.7
Locoregional relapse           20          3.7
Contralateral breast            5          0.9
Sum                           541        100.0
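
The percentages in Tables 4 and 5 follow from the listed frequencies divided by the respective sum. The short Python sketch below recomputes them as a consistency check. The dictionary and function names are chosen here for illustration, and rounding to one decimal place is an assumption inferred from the tabulated values.

```python
# Illustrative consistency check (names are hypothetical; frequencies are
# copied from Tables 4 and 5; one-decimal rounding is an assumption).

table4 = {
    "Censored": 276,
    "Distant metastasis": 66,
    "Locoregional relapse": 49,
    "Contralateral breast": 24,
}
table5 = {
    "Censored": 485,
    "Distant metastasis": 31,
    "Locoregional relapse": 20,
    "Contralateral breast": 5,
}

def percentages(counts: dict) -> dict:
    """Return each frequency as a percentage of the column sum, to 1 decimal."""
    total = sum(counts.values())
    return {event: round(100.0 * n / total, 1) for event, n in counts.items()}

for name, table in (("Table 4", table4), ("Table 5", table5)):
    print(name, "sum =", sum(table.values()), percentages(table))
```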



Table 6: Primers and Amplificates according to Example 3 



Forward primer   Reverse primer   Amplificate   Amplificate
SEQ ID NO:       SEQ ID NO:       SEQ ID NO:    number
1150             1151             1152          1
1153             1154             1155          2
1156             1157             1158          3
1159             1160             1161          4
1162             1163             1164          5
1165             1166             1167          6
1168             1169             1170          7
1171             1172             1173          8
1174             1175             1176          9
1177             1178             1179          10
1180             1181             1182          11
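
In the rows of Table 6 shown here, the three SEQ ID NOs of each amplificate are consecutive, starting at 1150 for amplificate 1 and advancing by three per amplificate. The Python sketch below expresses that observed pattern as a lookup. It is only a convenience for the eleven rows listed; the function name is hypothetical and nothing here implies that the pattern extends beyond these rows.

```python
# Illustrative lookup for the eleven rows of Table 6 shown above.
# Assumption: the numbering pattern (1150 + 3 * (n - 1)) is read off the
# table itself and is not stated as a rule in the text.

def table6_row(amplificate_number: int) -> tuple:
    """Return (forward primer, reverse primer, amplificate) SEQ ID NOs."""
    if not 1 <= amplificate_number <= 11:
        raise ValueError("Table 6 as shown lists amplificates 1 to 11 only")
    forward = 1150 + 3 * (amplificate_number - 1)
    return (forward, forward + 1, forward + 2)

for n in (1, 6, 11):
    print(n, table6_row(n))
```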



